ABSTRACT

A criterion-referenced measurement and diagnostic
system for career education was developed using 79 of the 177 basic
learner outcomes identified in Texas. Approximately 500 test items,
referenced to the outcomes, were developed and submitted for student
and professional review and statistical analyses following item
tryouts and field testing of the instruments. A sample of schools was
selected for each instrument at each of two levels, with 10
instruments at the lower level (grades 7 and 10) and 12 instruments
at the upper level (grades 8 and 11). In all, 506 classes were
distributed among 130 campuses in 84 school districts. Various
statistical procedures were used in item and instrument validation
for item tryouts and field testing. Forty-four of the learner
outcomes were tried out with students who had received instruction
specifically designed to develop the behavior described by these
outcomes. Data were obtained on 51 objectives measured by 215 items
for the 44 learner outcomes. The test results were reported to give
the student and school personnel diagnostic information about student
performance on the outcomes by using the school curriculum reference
evaluation format. Over two-thirds of the document contains appended
materials related to the processes involved in the study.
(Author/EC)
ABSTRACT

Introduction.

A criterion-referenced measurement and diagnostic system for career education was developed using 79 of
the 177 Basic Learner Outcomes identified in Texas. Approximately 500 test items, referenced to the
outcomes, were developed by professional item writers with limited input from a select sample of Texas
educators. These items were submitted to extensive student and professional review and statistical analyses
following item tryouts and field testing of the instruments.



Item Development and Validation.



The 79 outcomes were described in more detail by TEA and PARTNERS staff in paragraphs called "expansions." One to ten
behavioral objectives (approximately 220 in all) for each of these expanded outcomes were written by
WLC/MRC. Item development workshops with Texas educators were held to generate item ideas, and a total of
500 items from these ideas and the literature were written by SCORE test development specialists. These
items were submitted to student and professional review. Professional reviews were based on the following
criteria: (1) objective-item linkage, (2) reading level (6th grade), (3) non-offensiveness, and (4) clarity and
scorability of items. Student reviews were conducted with groups of five eleventh-grade students with at least
two representatives of each sex and a black, a brown, and a white student. Approximately 400 items were
reviewed at 34 schools.


Sampling Procedures. 



Approximately 1,800 eighth and eleventh-grade Texas students in 60 classrooms from Education Service Centers
(ESCs) IV, X, XI, XIII, and XX were selected for the first item tryout. The items were arranged into fifteen
"packages," and each package was administered to four classrooms of students: one eighth-grade class from
a campus of over 75% Mexican-American, one eighth-grade class from a campus of over 75% black, one
eighth-grade class from a campus of over 75% anglo, and one eleventh-grade class from a campus of over
75% anglo. A second tryout focused on 200 additional items. For the spring (1975) field test, a statewide
sample of approximately 13,000 students was selected (not twenty regional samples) using a stratified sampling
procedure for drawing schools according to the "proportional allocation" of students from the following
strata: (1) less than 33% Mexican-American, less than 33% black; (2) less than 33% Mexican-American,
greater than 33% black; (3) greater than 33% Mexican-American, less than 33% black. A sample of schools
was selected for each instrument at each of two levels, with ten instruments at the lower level (grades seven
and ten) and twelve instruments at the upper level (grades eight and eleven). In all, 506 classes were
distributed among 130 campuses in 84 school districts.
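
Proportional allocation gives each stratum a share of the sample equal to its share of the student population. The following is a minimal sketch in Python of that idea only. The stratum enrollment counts are hypothetical and are not taken from the study, and the rounding rule (largest remainder) is an assumption; the report does not describe how schools were drawn within strata.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Allocate total_sample across strata in proportion to stratum size.

    Largest-remainder rounding (an assumption) keeps the allocations
    summing exactly to total_sample.
    """
    population = sum(stratum_sizes.values())
    exact = {k: total_sample * n / population for k, n in stratum_sizes.items()}
    alloc = {k: int(v) for k, v in exact.items()}
    leftover = total_sample - sum(alloc.values())
    # Hand the remaining units to the strata with the largest fractional parts.
    by_fraction = sorted(exact, key=lambda k: exact[k] - alloc[k], reverse=True)
    for k in by_fraction[:leftover]:
        alloc[k] += 1
    return alloc


# Hypothetical enrollment counts for the three strata (illustrative only).
strata = {"stratum_1": 600_000, "stratum_2": 150_000, "stratum_3": 250_000}
allocation = proportional_allocation(strata, 13_000)
print(allocation)
```

With these illustrative counts the three strata hold 60%, 15%, and 25% of the students, so they receive the same shares of the 13,000-student sample.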


Statistical Procedures for Evaluation of Items and Instruments.

A variety of statistical procedures was used in item and instrument validation for item tryouts and field testing.
Item tryout analysis focused on: (1) measures and tests related to item difficulty: the relative difficulty of
the items as measured by p-values (the proportion or percent correctly answering the item) and the
significance test for chance (guessing) level performance as determined by the "Z-test"; (2) a chi-square test
for uniform foil response distribution, a test indicating the deviation from a uniform foil response distribution;
and (3) variations of p-values and foil response distributions across ethnic groups (blacks, Mexican-Americans,
and "others"). The statistical reports for the field test included the statistics used in the item tryouts and, in
addition: (1) measures of internal consistency: the point biserial correlation coefficient, a measure of the extent
to which the students' performance on the item is correlated with performance on the outcome; (2) measures
of instrument reliability: the Kuder-Richardson "formula 20"; (3) cultural validity analysis: (a) a chi-square test
for detecting heterogeneous foil response distributions across cultural groups or "cultural variation," (b)
Cramer's V, a measure of cultural variation which incorporates the sample size, (c) measures of cultural
variation with probabilistic interpretations which are especially useful for items with a small number of
incorrect responses, (d) content analysis which describes "bad" foils, ethnic bias, sex bias, and/or diagnostic
items; and (4) regression analysis, a statistical technique using p-values, number of foils, and z-scores for
placement of items at the appropriate grade level.
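
Two of the field-test statistics named above, the point biserial correlation and Kuder-Richardson formula 20, can be sketched in a few lines of Python. This is an illustration using the textbook definitions, not the project's own software. The small response matrix is invented, and the total score used here includes the item itself (an uncorrected point biserial), which is an assumption.

```python
from statistics import mean, pstdev, pvariance


def point_biserial(item, totals):
    """Point-biserial correlation between a 0/1 item and a total score."""
    p = mean(item)                      # proportion answering correctly
    q = 1 - p
    m1 = mean(t for x, t in zip(item, totals) if x == 1)  # mean total, correct
    m0 = mean(t for x, t in zip(item, totals) if x == 0)  # mean total, incorrect
    return (m1 - m0) / pstdev(totals) * (p * q) ** 0.5


def kr20(responses):
    """Kuder-Richardson formula 20 for a students-by-items 0/1 matrix."""
    k = len(responses[0])               # number of items
    totals = [sum(row) for row in responses]
    pq = 0.0
    for j in range(k):
        p = mean(row[j] for row in responses)
        pq += p * (1 - p)               # item variance for a 0/1 item
    return k / (k - 1) * (1 - pq / pvariance(totals))


# Invented data: four students, three items (1 = correct, 0 = incorrect).
responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
totals = [sum(row) for row in responses]
second_item = [row[1] for row in responses]

reliability = kr20(responses)
r_pb = point_biserial(second_item, totals)
print(round(reliability, 3), round(r_pb, 3))
```

Both functions assume every item has at least one correct and one incorrect response; a constant item or constant total score would cause a division by zero.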



Sensitivity-to-Instruction.

Forty-four of the learner outcomes were tried out with a special group of students who had received instruction
specifically designed to develop the behavior described by these outcomes. Learning modules were prepared
for students in the eighth and eleventh grades for objectives believed to be amenable to instruction over a
relatively short period of time. About 38 teachers in 36 schools volunteered to function as experimental and
control groups. The students in the experimental groups were pretested, instructed, and posttested utilizing
WLC/MRC test items; the students in the control groups were pretested, received no instruction, and were
posttested utilizing the same items. The following statistical procedures were used in analyzing the data: (1)
the Internal Sensitivity Index (ISI), measuring item quality from the perspective of the total test's discriminating
power; (2) the External Sensitivity Index (ESI) and the Roudabush "S," measuring an individual item's ability to
reflect learning (independent of the test); (3) the Objective Sensitivity Index (OSI), measuring the total test's
ability to discriminate between learners and non-learners; and (4) statistical tests of significance for detecting
differences between sensitivity indices for experimental and control groups. Data were obtained on 51
objectives measured by 215 items for the 44 learner outcomes.



Systems for Reporting Field Test Results to Teachers.

The test results were reported to give the student and school personnel diagnostic information about student
performance on the outcomes by using a modified version of the SCORE (WLC/MRC) report, which contains
data on: (1) whether each student mastered each outcome, (2) the percent of outcomes mastered by each
student, and (3) the percent of students mastering each outcome. A TEA-designed report, which contains
concise statements reflecting the degree of outcome mastery rather than the mastery/nonmastery format used in
the SCORE reporting system, was also utilized. Teachers favored the SCORE format, although the response to
the questionnaire was low because it was sent out rather late in the school year.

Statistical Procedures for Development of the Survey Instrument. 

A survey instrument was developed to diagnose the need for further measurement of student performance by
using one or more of the 22 category tests. A stepwise regression analysis was employed to select one or two
items which correlate highly with the "outcome" scores.

Implications 

Some of the implications of this effort are: (1) benefits occur as a result of using objectives that have been
developed from a large-scale study of the views of students, educators, and those outside of the field of
education, (2) objectives should be organized in appropriate form before selection/development of items, (3)
design of reporting strategies should begin with the initial development procedures, (4) special attention
should be given to item development activities for an area such as career education, (5) from 30% to 50% of
the items in an objective-based system will be discarded during a rigorous review by students, (6) student
review of items is productive, (7) advances have been made in the kinds of statistical analyses that are
available for item and test construction in an objective-based measurement system, and (8) additional benefits
accrue when a state department of education, a regionally-based project, and a contractor work together.
CHAPTER I
INTRODUCTION



Background 



In 1972 the Texas State Board of Education identified career education as one of several top priorities for
development. An initial implementing activity of this priority designation was a statewide survey conducted by
the Division of Program Planning and Needs Assessment of the Texas Education Agency (TEA) and the
Partners in Career Education Project (PARTNERS)¹ to find out what the citizens of Texas believed student
development should be in terms of career education. The specific research question considered for the survey
was: what skills, capabilities, knowledge, attitudes, or other characteristics are considered to be basic
requirements for 17-year-old Texas students? A listing of 279 possible student outcomes was prepared for the
survey based upon the following:

• an extensive review of all available career education literature 

• visits and consultations with career education practitioners both in Texas and in other states 

• the products generated during a series of more than thirty work-group conferences with students, 
educators, parents and representatives of the business and industrial community. 

More than 6,000 individuals (parents, students, educators and representatives of business and industry) from
every region of the state reviewed the listing and rated the outcomes as either "basic," "desirable," or
"inappropriate" for Texas students. Of the 279 outcomes utilized for the survey, 177 were rated as "basic" and 102
as "desirable." None were rated as "inappropriate" for Texas students. To assist in organizing the basic
outcomes, they were arranged into nine categories.

A Request for Proposal (RFP) was issued by TEA detailing the requirements of a career education 
measurement system for Texas. The measurement system was to contain test items designed to measure 
student development in terms of the previously validated basic learner outcomes. In February of 1974 
WLC/MRC entered into an agreement with PARTNERS and with TEA for the development of a criterion- 
referenced measurement and diagnostic system for career education. 



Selection of Outcomes to be Measured 

Reduction of the 177 basic learner outcomes to a more manageable number prior to commencing test item 
development was a first step in the developmental process. A series of activities involving staff of TEA, 
WLC/MRC, and PARTNERS, knowledgeable educators, and representatives of the business and industrial 
community reduced the number of basic learner outcomes to 79. WLC/MRC was instructed to develop test 
items for the measurement of this reduced number. 



Item Development and Reviews

Following identification of the outcomes to be measured, WLC/MRC developed some 220 parallel behavioral
objectives to be used as guides in the creation of test items. Upon acceptance of the behavioral objectives,
WLC/MRC, PARTNERS, and TEA personnel conducted an initial test item development program in two stages.
In the first stage, groups of Texas educators consisting primarily of counselors and career education
specialists were brought together in four regional education service centers (ESCs). After an initial orientation
session, the greater part of one day was spent in generating items to measure specifically assigned
objectives. Participants were urged to continue with the creation of test items during the following two-week
period. Items generated in this fashion were sent to WLC/MRC for refinement and editing. The second stage of
the initial item development effort involved the creation of approximately 450 test items by the WLC/MRC
professional staff. PARTNERS and TEA coordinated stringent review sessions with Texas educators, through
the ESCs, across the state. The review process required the objective classification of items according to a
specially prepared evaluation form. Another aspect of the item review, which yielded valuable results, utilized
panels of students who were encouraged to give their opinions freely about the intelligibility, appropriateness
for various grade levels, and relevancy of the items.


¹ Partners in Career Education is a five-year cooperative project for the development and dissemination of a career education learning system. It is funded by the
Texas Education Agency and sponsored by Dallas and Fort Worth Independent School Districts and Education Service Centers Regions X and XI.
After the item reviews were completed, the items were edited, revised, or deleted according to the composite
recommendations of the reviewing groups. Additional reviews of the items and objectives were then conducted
by PARTNERS and TEA personnel and by consulting career education professionals in preparation for an
initial tryout of the items with students.



Item Tryouts and Analyses



Items found to be acceptable were then prepared in test format for a tryout with a broader sampling of Texas
students. This sample was carefully selected to include students from all geographic areas of the state. All
substantial minorities and all sizes of schools were represented. Test items utilized in this tryout (Phase I)
were administered to more than 1,700 eighth and eleventh-grade students.

Simultaneously, 52 of the 220 behavioral objectives prepared by WLC/MRC were chosen for use in a
sensitivity-to-instruction study. The students involved were pretested, instructed toward the particular objectives
selected, and posttested. The instructional materials used were PARTNERS/teacher-developed learning
activity packages. The pretests and posttests were identical. This was the only phase of the item tryout testing in
which students were actually instructed toward objectives which the test items were designed to measure. A
control group of students who had not received instruction toward the objectives was also used. WLC/MRC
statisticians conducted tests of statistical significance for the observed differences in the proportion of
"gainers" (those who failed the pretest and passed the posttest) between experimental and control groups.
Moreover, various sensitivity-to-instruction indices were computed and tests of statistical significance
conducted on the difference in index values between experimental and control groups.
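
The comparison of "gainer" proportions between experimental and control groups can be illustrated with a pooled two-proportion z-test. The report does not say which significance test WLC/MRC used, so the choice of test here is an assumption, and all counts below are invented for illustration.

```python
from math import erfc, sqrt


def gainer_proportion(pre, post):
    """Share of students who failed the pretest (0) and passed the posttest (1)."""
    gainers = sum(1 for a, b in zip(pre, post) if not a and b)
    return gainers / len(pre)


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic and two-sided p-value for H0: equal proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two_sided = erfc(abs(z) / sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return z, p_two_sided


# Invented pass/fail records for four students on one objective.
share = gainer_proportion(pre=[0, 0, 1, 1], post=[1, 0, 1, 0])

# Invented counts: 40 gainers of 100 instructed, 20 gainers of 100 controls.
z, p_value = two_proportion_z(40, 100, 20, 100)
print(share, round(z, 3), round(p_value, 4))
```

A large positive z with a small p-value would suggest that instruction, rather than retesting alone, accounts for the gain on that objective.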

Completion of the Phase I tryouts marked a major milestone in the item development stage, and a thorough
reexamination of the WLC/MRC objectives prepared for each outcome and the items tried out for each objective
was undertaken. PARTNERS, WLC/MRC, and TEA personnel reviewed the relationship of these major
components of the system for the purpose of assuring that there was a clear and significant link between each
outcome, its objectives, and the test items. Approximately 25% of the objectives were revised as a result of this
reexamination. A similar percentage of the items were either revised or discarded. Also considered during
this stage was the practicality of test administration. A decision was reached to convert a number of items
from matching or open-end response patterns to a multiple-choice format. It should be noted that both
PARTNERS and TEA personnel retained a willingness to utilize types of items which called for more difficult
administrative modes in order to obtain more valid measurement. A number of short-answer items and attitudinal
surveys were retained, as were teacher-completed longitudinal surveys of individual student behaviors.
"Comic-strip" type items and videotape stimuli were continued as a part of the item bank.

Because of changes to existing objectives and new objectives being developed, new items were also needed.
These new items were developed by WLC/MRC, by PARTNERS, and by TEA personnel. Two reviews of these
new items were conducted, one to verify the item-to-objective match, and another for item content validity.
Item reviewers had available all of the previously accumulated review information. The reviewed and refined
items were then tried out (Phase II) in essentially the same manner that Phase I was conducted. Some of the
Phase I items were again tried out during Phase II to gain additional response information. The number of
students involved in Phase II was somewhat smaller than for Phase I, with approximately 1,600 individuals
participating.

Analysis of the results of both Phase I and Phase II item tryouts was conducted by WLC/MRC. The analysis
focused on three major concerns: (1) the relative difficulty of the items as measured by p-values and
significance tests for chance performance (student guessing), (2) statistics measuring deviation from a
uniform foil response distribution, and (3) variation of p-values and foil response distributions across ethnic
groups (blacks, Mexican-Americans, and "others"). In addition, a technique utilizing professional judgment
and regression analysis was developed for determining the appropriate grade level for the items tried out for
each outcome. A three-day review session involving members of PARTNERS, TEA (including the Assessment
of Career Education Steering Committee), and WLC/MRC was conducted using the accumulated data and
subjective judgment as to content analysis. A number of the items tried out were dropped, some were passed as
tried out, and some were passed subject to editing and/or revision.
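
The first two concerns of the tryout analysis can be sketched as follows. This is a minimal Python illustration using the standard one-sample z statistic against the chance level and the Pearson chi-square statistic against a uniform distribution. The report does not give its exact formulas, so these are assumed forms, and the response counts are invented.

```python
from math import sqrt


def item_difficulty(n_correct, n_students):
    """p-value of an item: the proportion of students answering correctly."""
    return n_correct / n_students


def chance_z(n_correct, n_students, n_options):
    """z statistic for H0: proportion correct equals chance, 1 / n_options."""
    p0 = 1 / n_options
    p = n_correct / n_students
    return (p - p0) / sqrt(p0 * (1 - p0) / n_students)


def foil_chi_square(foil_counts):
    """Pearson chi-square of foil (wrong-option) counts against uniformity.

    Degrees of freedom are len(foil_counts) - 1.
    """
    n = sum(foil_counts)
    expected = n / len(foil_counts)
    return sum((c - expected) ** 2 / expected for c in foil_counts)


# Invented item: 100 students, 4 options, 55 correct, foils chosen 30/10/5.
p_val = item_difficulty(55, 100)
z = chance_z(55, 100, 4)
chi2 = foil_chi_square([30, 10, 5])
print(p_val, round(z, 3), round(chi2, 3))
```

A z near zero would indicate guessing-level performance, while a large chi-square indicates that one foil is drawing far more (or fewer) responses than the others and may deserve content review.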


Field Test

An extensive field test of the refined items initiated the final developmental stage. Twenty-two instruments
utilizing 382 items (from an original bank of more than 500) for the measurement of 200 objectives were
designed. Items were sequenced on each instrument in the order of outcome difficulty within each category. A
sample of 13,000 Texas students was selected from grades 7, 8, 10, and 11. The WLC/MRC Report
Coordinator, working in close consultation with TEA, designed the sampling procedures and selected the sample of
schools.



Analysis of Field-Test Results and Instrument Design

The statistical procedures and software for scoring and analyzing the field tests were developed by the report
coordinator and WLC/MRC programming staff. Statistical reports designed for the field tests included those
statistics used in the Phase I and Phase II tryouts and the following additional components:

1. a measure of internal consistency (point biserial) and a statistic which measures the extent of
influence of the p-value on the point biserial

2. a separate item analysis for each group corresponding to various cultural variables, such as ethnic
origin, sex, and educational emphasis in the home

3. statistical indicators of "cultural variation," i.e., the degree to which foil response distributions
(excluding the correct response) vary across cultural groups

Some of the above procedures were developed during the course of the project in an attempt to deal more
effectively with questions concerning item and instrument validity for criterion-referenced tests. For example,
the procedures mentioned in (3) above were found to be useful in detecting culturally related problems with
items, such as bias, bad foils, bad format, etc.
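
One way to quantify "cultural variation" is a chi-square test of homogeneity on a groups-by-foils table, summarized by Cramer's V, both of which the abstract names. The sketch below uses the standard definitions in Python; the counts are invented, and the project's additional measures with probabilistic interpretations are not reproduced here.

```python
from math import sqrt


def chi_square_and_cramers_v(table):
    """Chi-square of homogeneity and Cramer's V for a groups-by-foils table.

    Rows are cultural groups; columns are foils (incorrect options only).
    """
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0])) - 1
    return chi2, sqrt(chi2 / (n * k))


# Invented counts: three groups, three foils, each group favoring a different foil.
chi2, v = chi_square_and_cramers_v([[20, 10, 10],
                                    [10, 20, 10],
                                    [10, 10, 20]])
print(round(chi2, 3), round(v, 3))
```

Because V divides the chi-square by the sample size, it allows items answered by different numbers of students to be compared on a common 0-to-1 scale.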

The results of the field test were analyzed by personnel of TEA, PARTNERS, and WLC/MRC. The statistical
data were then used to determine which items should be dropped, revised, edited, or used without
modification. This revision session resulted in sixteen instruments with a total of 273 items for use in
measuring the nine categories of learner outcomes. Of the 273 items, 187 were judged to be acceptable in that
they passed the review with minor or no modification. Using this pool of acceptable items, a stepwise
regression analysis was conducted to determine which items were most appropriate for inclusion in a survey
instrument intended for use in screening students prior to administration of the more detailed category tests.
Based upon these statistical procedures and the judgment of TEA professionals, the survey test was
developed. It will be tried out with a statistically controlled sample of Texas students during the fall of 1975 for
a statewide needs assessment study.
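
The idea of picking one or two items that best predict an outcome score can be sketched as a simplified forward selection. This is not the project's stepwise procedure, which would normally include entry and removal tests; it only chooses the item most correlated with the outcome score and then the second item that most raises R-squared. Item names and data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; assumes neither variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def r2_two(r1, r2, r12):
    """R-squared of a two-predictor regression from the three correlations."""
    return (r1 * r1 + r2 * r2 - 2 * r1 * r2 * r12) / (1 - r12 * r12)


def select_two_items(items, outcome):
    """Forward selection: best single item, then the best added item (if any)."""
    r = {name: pearson(col, outcome) for name, col in items.items()}
    first = max(r, key=lambda name: r[name] ** 2)
    second, best_r2 = None, r[first] ** 2
    for name in items:
        if name == first:
            continue
        r12 = pearson(items[first], items[name])
        if abs(r12) > 0.999999:         # redundant with the first item
            continue
        candidate = r2_two(r[first], r[name], r12)
        if candidate > best_r2:
            second, best_r2 = name, candidate
    return first, second, best_r2


# Hypothetical 0/1 item responses; the outcome score is item a plus item b.
items = {
    "a": [0, 0, 1, 1, 0, 0, 1, 1],
    "b": [0, 1, 0, 1, 0, 1, 0, 1],
    "c": [0, 0, 0, 0, 1, 1, 1, 1],
}
outcome = [a + b for a, b in zip(items["a"], items["b"])]
first, second, r2 = select_two_items(items, outcome)
print(first, second, round(r2, 3))
```

In this constructed example items a and b together reproduce the outcome score exactly, so they are the pair selected, and item c, which is unrelated to the outcome, is left out.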
CHAPTER II



ITEM DEVELOPMENT AND VALIDATION
Objectives



Activities required by the WLC/MRC contract began following selection of the 79 priority outcomes to be used
for the assessment of career education in the state of Texas. Selection was made by TEA and PARTNERS
personnel based upon the votes of a large number of Texas educators and other professional groups. Each of
these 79 outcomes was then described in greater detail by TEA and PARTNERS staff personnel in paragraph
format. These expansions were descriptors of the intent of each outcome.

Based upon the expanded outcomes, WLC/MRC prepared from one to ten objectives for each outcome. The
objectives were stated in behavioral terms and formed the bases for test item development.

The objectives (approximately 220 in number) were reviewed by TEA and PARTNERS to assure that each one
represented an element of the outcome for which it was written. The review also evaluated the efficiency with
which the objectives addressed all of the elements of each outcome.

Item Development 

Once the objectives were developed and reviewed, the plans for item development began. Two processes
were simultaneously initiated. One was to assign sets of outcomes and objectives to career education
specialists and counselors in the Iowa City area and request that items be developed. The other process was
to conduct four regional workshops in Texas for the purpose of training Texas educators and specialists to
develop items.

The workshops consisted of a one-day meeting with about 20 to 50 people being trained in each workshop. In
the morning, item development procedures and techniques were discussed. Included in the discussion was a
review of item formats and the procedures for appropriately matching format to an objective. In the afternoon,
the participants divided into groups of three to six to work on item development. During that time, the
WLC/MRC representative circulated and critiqued the work being done. This item development work continued
for about three hours, at which time some of the work was collected.

At the end of the day, each participant was assigned specific outcomes/objectives and requested to attempt to
develop additional items on an individual basis over a period of two to three weeks. These completed items
were sent to WLC/MRC for review and refinement prior to inclusion in the measurement system.

Phase I item development was completed utilizing the experienced test development specialists who had
been involved with the WLC/MRC SCORE program. Objectives were assigned to professionals from this
program, and within one month over 500 test items were delivered to TEA and PARTNERS.

Review 

As the items were developed, they were submitted for review by

• a WLC/MRC career education specialist,

• the TEA staff, and

• the PARTNERS staff.

The purpose of these reviews was to find out if the

• items measured the keyed objectives,

• language of the item was at a reading level of sixth grade or below,

• item communicated its intent,

• item was non-offensive,

• format was simple and clear,

• item was scorable,

• instructions for administration were clear, and

• item was technically correct in use of terms.

The results of these professional reviews were submitted to WLC/MRC for inclusion in the
revision/recommendations file.




The second phase of the review process involved students. In April, 1974, guidelines were developed to obtain
the candid reactions of eleventh-grade students to the test items proposed for the career education
measurement system. Early in May, 1974, a plan was finalized for obtaining evaluation data from students. This
plan was outlined in the "Criteria for Item Acceptability." (See Appendix L.) Guidelines for student reviews as
described in the plan were:

• Each item would be submitted to student review.

• Item reviews would be conducted by a person not employed by the school.

• The person conducting the review would serve as a facilitator and recorder of student reactions.

• Schools selected as review sites would contain students with different ethnic backgrounds
(Mexican-American, black, and anglo) and both sexes.

• The review teams would include five eleventh-grade students with at least two representatives of each
sex and a black, a Mexican-American, and an anglo student.

The student reviews were conducted by a TEA or a PARTNERS staff member. When the student review team
at a particular school had been assembled, the individual conducting the review described the procedures,
assured the students that they were not being tested but that they were being asked to critique new test items,
stated how their input would be used, and explained that the test items had been written by a third party (the
contractor). The last comment seemed to make the students feel free to comment on the items.

A list of questions was developed to guide the student review sessions (Appendix A). These questions dealt
with item readability, appropriateness, structure, bias, and non-offensiveness. Students were asked to read a
career education outcome and the item that was proposed for measuring it. Open discussion followed, with the
recorder documenting student reactions for as many of the above areas as possible. After 15-30 minutes,
direct questioning was used to fill gaps in the areas of concern listed above. On the average, students
reviewed eight items in a two-hour session. Students tended not to tire as readily when they were asked to
review items of differing formats.

In the judgment of TEA and PARTNERS staff members who conducted the sessions, the reviews were
productive and fully justified the time and effort expended. The students were generally open in their comments
about items. They saw implications that the staff and educator reviews did not see. Approximately 400 items
were reviewed at 34 school campuses. Schools ranged in location from those in large cities to those in rural
areas. A majority of reviews were conducted in metropolitan areas.

The results of the student review sessions are summarized as follows:

Conditions        Number of Items    Percentage

Acceptable              129              30%

Need Revision           267              63%

Rejected                 30               7%

TOTALS                  426             100%

In most instances, item writers had files of student suggestions for improvement as well as reasons for their
recommended revisions. The results of the student review of items indicated that obtaining student
input is a necessary step in the development of an objective-based instrument. Although statistical analyses of
item tryout data will yield information pertinent to certain item characteristics, student interviews seem to be
the most feasible and economical method of determining answers to questions such as:

• Do students understand the intent of the item?

• Is the item too advanced or too simple for the target age-group of students?

• Do certain words or phrases offend the target age-group?

• Why do many students feel that there is more than one correct answer?

Finally, each of the twenty ESCs was requested to provide a sufficient amount of staff time to conduct
teacher/educator review sessions to obtain a critique of the items from classroom teachers, counselors, and
administrators. One-half day was allocated for these sessions.

The twenty regions were divided into four characteristic classes: (1) Mexican-American, (2) black, (3) rural
white (anglo), and (4) big city suburban white (anglo). Each item set (about sixteen items) was submitted to
one review group of five educators in each of the four classes using the review form contained in Appendix B.
In this way, every item was seen by four different groups of people. It was anticipated that this would provide
input on every item from representatives of every major population group in the state.

All of the information obtained from the four phases of review (career education specialists, TEA and
PARTNERS, students, and educators) was compiled and summarized by WLC/MRC staff. When a disagreement or
discrepancy in decision existed for an item, the WLC/MRC professional staff reviewed all of the inputs from
the various groups and disposed of the item in a manner considered to be most consistent with the reviewing
groups' positions. When understandability was in question, emphasis was placed upon student input. If,
however, the problem was one of administration or clarity of the scoring guide, emphasis was placed upon
educator input.

The summary information obtained from a detailed analysis of the review data was referred to the professional
writers for use in item revision in preparation for item tryouts. Every item that was not passed by all review
groups as suitable for use was referred to an item writer for possible revision. In a few instances, items were
not changed because of insufficient information from the reviewing groups. Tryout data were required to
determine final revision on these items.

Tryouts

The trying out of items was an important step in the overall development of the Career Education
Measurement System because the process provided a substantial amount of insight about each item prior to
its becoming part of an instrument. This information was gathered from approximately 1,800 eighth and
eleventh-grade Texas students and, in some instances, teachers. The particulars gathered about each item
included appropriateness, readability, acceptability, and clearness of directions.

Although information similar to this was obtained through student and professional reviews, the item tryouts
presented the items visually in test context and format. Inputs from the large number of students who actually
responded to these tests provided real-life information about the test items. From the data obtained, the
following decisions could be made:

• include an item in the instruments being designed,

• exclude an item,

• revise an item prior to inclusion in an instrument, and

• determine the range of additional items needed for satisfactory measurement of an outcome.

For the initial tryouts, the test items were organized into approximately fifteen booklets or packages by
category and mode of administration. The classroom was the smallest unit of sampling for the item tryouts.
Approximately sixty classrooms were used. Each item package was administered to four classrooms of students
as follows:

• one eighth-grade class from a campus over 75% Mexican-American,

• one eighth-grade class from a campus over 75% black,

• one eighth-grade class from a campus over 75% anglo, and

• one eleventh-grade class from a campus over 75% anglo.

Administration time was 45 minutes or more for each package. Each student was asked to complete student
identification information questions. In addition, approximately 20% of the students from each classroom were
randomly selected for individual interviews of about ten minutes following completion of the test. The tryout
administration extended over a two-hour period in most instances. Personnel from an ESC, PARTNERS,
WLC/MRC, or TEA administered the test packages in cooperation with the teacher in charge of the class. The
package administrator conducted the personal interviews with the selected students.

The tryout data were used for determining the extent to which each item met the following criteria for
acceptability:

• Not more than 10% of the students will indicate difficulty in understanding the item.

• Not more than 10% of the (student) responses may indicate offensiveness or bias.

• Not more than 10% of the students in item tryouts will indicate difficulty with understanding item
directions as determined by interview.

• No more than 5% of the teachers should express any difficulties in scoring the items.

• Questions were asked of educator-administrators about ease of administration and clearness of
directions. No more than 15% of the responses should indicate any difficulty.
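
The acceptability criteria above amount to a set of percentage thresholds, and a short sketch may make the screening step concrete. The field names and the example figures below are hypothetical and are not taken from the tryout data; only the five limits come from the list above.

```python
# Limits taken from the acceptability criteria; the key names are
# hypothetical labels chosen for this illustration.
THRESHOLDS = {
    "student_difficulty_understanding_item": 0.10,
    "responses_indicating_offensiveness_or_bias": 0.10,
    "student_difficulty_with_directions": 0.10,
    "teacher_difficulty_scoring": 0.05,
    "administrator_responses_indicating_difficulty": 0.15,
}


def failed_criteria(observed):
    """Return the criteria whose observed proportion exceeds its limit."""
    return [
        name
        for name, limit in THRESHOLDS.items()
        if observed.get(name, 0.0) > limit
    ]


# Invented tryout summary for one item (proportions, not percentages).
example_item = {
    "student_difficulty_understanding_item": 0.04,
    "responses_indicating_offensiveness_or_bias": 0.02,
    "student_difficulty_with_directions": 0.09,
    "teacher_difficulty_scoring": 0.08,
    "administrator_responses_indicating_difficulty": 0.10,
}

print(failed_criteria(example_item))
```

An item with an empty result would meet all five criteria; any names returned indicate where revision might be considered.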


Additional Item Development and Tryouts

As a result of the reviews described above, changes to objectives, and the development of new objectives,
many new items were needed. Of the 400 items tried out, approximately 25% were discarded for
various reasons.

Because of the limited time available, PARTNERS sent a staff of five people to Iowa City to work on the review
and revision of the new items with WLC/MRC staff members. Items were routed to a review/revision committee
as they were written, and the necessary changes and revisions were made immediately. More than ?00 items
were completed.

As revisions were completed, items were typed in a camera-ready format of about fifteen items in each test
booklet for a second tryout phase.

The Phase II item tryouts were accomplished utilizing students in ESCs I, X, and XI. Administration of the test
booklets was accomplished by the PARTNERS staff in cooperation with the participating classroom teachers.
The procedures followed were essentially the same for both Phase I and Phase II.

A final tryout was conducted which included trying out Phase I items for which there had been an insufficient
number of respondents during Phase I. This tryout was also conducted by the PARTNERS staff. Results of all
three tryouts were utilized in a final review session attended by personnel from WLC/MRC, PARTNERS, and
TEA representatives, including the ACE Committee. Decisions were made about which items would become a
part of the field test.

Preparation for Field Tests 

By March of 1975 the information accumulated from three phases of item tryouts, a sensitivity-to-instruction
study (see Chapter V for details), and four workshop conferences had been subjected to detailed examination
and penetrating analyses. Many of the original test items had been abandoned, most of the remainder had
been revised in some fashion, and a number of new items had been written. The total number of items
available for the field test was 382. (See Appendix M for materials used in Texas with ESCs and local school
districts during the field test.) These were prepared in 22 separate instruments for administration to students
in four grades at two levels. The level one instruments were for grades seven and ten and the level two
instruments were for grades eight and eleven. Sampling procedures for the field test are discussed in Chapter
III.

The following considerations guided the design of the 22-instrument battery of tests:

• a standardized format

• clarity of instructions for administration and scoring

• item readability

• item simplification

• item arrangement within each instrument

• grade level appropriateness

Grade level appropriateness was determined by a regression analysis technique which is discussed in
Chapter IV.

Post-Field Test Reviews 

As a result of the field trials in the spring of 1975, item analyses were provided to TEA and PARTNERS. Some
tentative guidelines for item validation were proposed by WLC/MRC statisticians. (See Chapter IV for a
discussion of the statistical procedures.)

Two teams of reviewers were formed, each having representation from the three organizations (PARTNERS,
TEA, and WLC/MRC). The teams reviewed the findings using the following: (1) the statistical analyses
(summary sheets prepared by TEA), (2) a content analysis examining the quality of the content of the item in
relationship to the outcome it purported to measure, as well as the vocabulary level of the items, and (3)
teacher input from a questionnaire obtained from the spring field test. Each item was then categorized as
acceptable, editable with minor revisions, or inappropriate for the measurement system.

Assembling the Category and Survey Tests 

Assembling the final tests consisted of selecting appropriate formats and organizing the items into sixteen
category instruments and one survey instrument. The organization of items for the category instruments was
based upon the general category, the sub-category, and the outcome for which sets of items had been
developed. The order or sequence of items within an instrument was determined by the content dimension of
each item. The resulting arrangement was according to difficulty, specificity, and item length. Also considered
was the relationship of items within a set or group which measured a sub-category, the stimulus for each item,
and the response patterns of linked items.






The survey instrument was developed to diagnose student performance in relation to the various categories
and sub-categories as measured by the sixteen category tests. The items found to be the most appropriate
(representative) from each sub-category were selected to provide indicators of probable student performance
on the outcomes contained within a particular sub-category. Forty-five items were selected for the survey
instrument to represent the 26 sub-categories into which the nine general categories were divided. Performance
on the survey test will be utilized to determine whether administration of one or more of the category tests to a
student (or groups of students) is indicated.

CHAPTER III

SAMPLING PROCEDURES

Item Tryout Sample

Approximately 1,800 eighth and eleventh-grade Texas students were selected for the first tryout sample.
Approximately 60 classrooms were used (the classroom was the smallest unit of sampling for the item tryouts).
The items were arranged into fifteen packages and each package was administered to four classrooms of
students: one eighth-grade class from a campus of over 75% Mexican-American, one eighth-grade class from
a campus of over 75% black, one eighth-grade class from a campus of over 75% anglo, and one eleventh-grade
class from a campus of over 75% anglo. Because of their high ethnic concentration, ESCs IV, X, XI, XIII,
and XX were selected as item tryout sites for grades eight and eleven. A sample of campuses in these districts
was selected proportional to student enrollment.

This was not a random sample. No statistical controls were deemed necessary here since the purpose of item
tryouts was to try out items, not to make statewide inferences. A list of campuses participating is given in
Appendix C by district and region.

Schools participating in the Phase II item tryouts were located in ESCs X and XI. These were selected from the
sample used for Phase I. In addition, four schools were added in ESC I to include a greater number of
Mexican-American students. Each Phase II package was tried out with six classrooms:

• eighth and eleventh-grade blacks;

• eighth and eleventh-grade Mexican-Americans;

• eighth and eleventh-grade "others."

Field Test Sample

A random sample of approximately 13,000 students was selected for the field test, which was administered in
the spring of 1975. This sample was smaller than originally planned. Additional refinement of the instruments
was considered to be essential prior to attempting a larger statewide field trial. Moreover, because of the
developmental stage of the measurement instruments, neither state nor regional inferences were considered.
Nevertheless, statistical controls were applied in an attempt to obtain a sample that would yield unbiased
estimates with reasonably good precision. The main purpose of the field test was, however, to secure
information to be used for further refining the measurement instruments.

A stratified sampling procedure was utilized for selecting schools from the following strata:

• less than 33% Mexican-American, less than 33% black;

• less than 33% Mexican-American, greater than 33% black;

• greater than 33% Mexican-American, less than 33% black.

A fourth category, "greater than 33% Mexican-American, greater than 33% black," contained only a few
schools; these were randomly allocated to strata two and three above.

A sample (of schools) was selected for each instrument in four grades at two levels: grades seven and ten for
lower level instruments and grades eight and eleven for upper level instruments. The number of schools
selected within each stratum was determined by "proportional allocation" with respect to the number of students
within each stratum. In other words, the number of schools selected within each stratum (for each instrument)
is proportional to the number of students in each stratum: the more students, the more schools are sampled.

The information relevant to the allocation of schools to strata is given in the table below.

Grade      Stratum    Population    Proportion*    Schools Selected
8 (7)      1          146,456       0.70
           2           28,872       0.14
           3           32,399       0.16
11 (10)    1          136,05?
           2           19,296       0.10
           3           31,227       0.17

* Proportion equals the number of students in the stratum divided by the total number of students in all
strata (at a grade level).

[Blank cells and "?" mark entries that are not legible in this copy.]



The reason for taking n = 12 schools in grades seven and eight, and n = 11 schools in grades ten and eleven,
was to obtain an allocation which was closer to the values given in the "proportion" column.
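
As a hedged illustration of proportional allocation, the sketch below computes stratum proportions from the grade 8 (7) populations in the table (146,456; 28,872; 32,399) and distributes n = 12 schools. The largest-remainder rounding rule and the function name are assumptions made for this example. The report does not state how fractional schools were resolved, and the resulting counts are illustrative rather than the reported allocation.

```python
def proportional_allocation(stratum_sizes, n_schools):
    """Allocate n_schools across strata in proportion to stratum size.

    Fractional quotas are resolved with a largest-remainder rule, which is
    an assumption of this sketch and not a rule stated in the report.
    """
    total = sum(stratum_sizes)
    proportions = [size / total for size in stratum_sizes]
    quotas = [p * n_schools for p in proportions]
    counts = [int(q) for q in quotas]  # whole schools first
    leftovers = n_schools - sum(counts)
    # Hand the remaining schools to the strata with the largest fractions.
    by_remainder = sorted(
        range(len(quotas)), key=lambda i: quotas[i] - counts[i], reverse=True
    )
    for i in by_remainder[:leftovers]:
        counts[i] += 1
    return proportions, counts


grade_8_populations = [146456, 28872, 32399]
proportions, counts = proportional_allocation(grade_8_populations, 12)
print([round(p, 3) for p in proportions])
print(counts)
```

The computed proportions agree with the "proportion" column to within rounding.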


The schools in each stratum were then selected with probability proportional to size (p.p.s.). That is, larger
schools were more likely to be selected than smaller schools, and their relative likelihoods were proportional
to their relative sizes. This process may be illustrated as follows. Suppose there are five schools in a certain
stratum and two are to be selected. The school populations are 20, 30, 100, 150, and 200, respectively. The
populations may be represented graphically as ranges or distances between points as plotted in Figure 1
below. For example, school 3 falls in the range of 50 to 150, which corresponds to a population of 100 students.

Figure 1. Graph representing populations of schools 1-5.

The probabilities of selection are thus proportional to the lengths of the line segments corresponding to the
populations. If one thinks of each unit on the line in Figure 1 as representing one student, it is clear that each
student has an equal chance of being selected. This is as it should be, since a sample representative of
students in the population is desired (cf. Cochran, 1963).
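
The line-segment picture can be sketched in a few lines of code. The cumulative ranges below follow directly from the five school populations in the illustration. The systematic selection with a random start is one common way to draw a p.p.s. sample and is an assumption of this sketch, since the report does not say which drawing mechanism was used. All function names are hypothetical.

```python
import random
from itertools import accumulate


def cumulative_ranges(sizes):
    """Return (lower, upper) endpoints of each school's line segment."""
    uppers = list(accumulate(sizes))
    lowers = [0] + uppers[:-1]
    return list(zip(lowers, uppers))


def school_at(point, ranges):
    """Return the 1-based school whose segment (lower, upper] holds point."""
    for index, (lower, upper) in enumerate(ranges, start=1):
        if lower < point <= upper:
            return index
    raise ValueError("point lies outside the population line")


def systematic_pps(sizes, n_select, rng):
    """Pick n_select schools at equally spaced points after a random start."""
    ranges = cumulative_ranges(sizes)
    total = ranges[-1][1]
    step = total / n_select
    start = step * (1.0 - rng.random())  # lies in (0, step]
    points = [min(start + k * step, total) for k in range(n_select)]
    return [school_at(point, ranges) for point in points]


populations = [20, 30, 100, 150, 200]
ranges = cumulative_ranges(populations)
print(ranges)  # school 3 occupies the segment from 50 to 150
print(systematic_pps(populations, 2, random.Random(0)))
```

Because every unit of the line stands for one student, a school's chance of containing a selection point is proportional to its enrollment. With this particular mechanism a school larger than the step would be selected with certainty, which does not arise in the five-school illustration.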

Finally, one classroom was volunteered from each school. The classroom selected was "typical" according to
ethnic and other cultural considerations. There were, prior to field testing, ten instruments at the lower level
and twelve instruments at the upper level. Since the seventh and eighth-grade samples each had twelve
classrooms per instrument and the tenth and eleventh-grade samples each had eleven classrooms per
instrument, there were

(10 x 11) + (12 x 11) + (10 x 12) + (12 x 12) = 506

classrooms selected altogether. These 506 classes were distributed among 84 school districts and included
130 campuses. Since the average class size was thought to be around 30, a sample of around 506 x 30 =
15,180 was anticipated. (The number of students actually selected was somewhat lower than this number.) A
list of the schools selected and information concerning their involvement in the project is provided in an
appendix.

An attractive by-product of the sampling method discussed in the preceding section is that "self-weighting"
procedures may be employed to estimate p-values, percent mastering objectives, point biserials, and KR-20
reliability coefficients, obviating the computation of more complicated weighted estimates. The theoretical
basis for using "self-weighting" estimators is given in Appendix K.

CHAPTER IV
STATISTICAL PROCEDURES FOR EVALUATION
OF ITEMS AND INSTRUMENTS

Introduction

This chapter contains a description of the statistical procedures used in item and instrument validation, along
with some examples of how the techniques were applied to actual test data.

Various statistical procedures were employed to secure a scientific evaluation of the items and instruments
which comprise the Texas Career Education Measurement System. Some of these procedures, such as
p-values, point biserials, and KR-20 reliability coefficients, are classical test construction statistics. During the
course of the project, however, new approaches and procedures to statistical validation of items were
developed. For example, the techniques for measuring the cultural validity of items were developed through
valuable interaction between the WLC/MRC project coordinator and Keith Cruse of TEA. The test for chance
(guess) level of functioning, a Z-test, was developed in order to test whether or not the p-value for a sample
was above or below that which would be expected by chance if the students were guessing. All of the above
statistics were computed by the WLC/MRC Instrument Analysis program package. (See Appendix D.)



Measures and Tests of Item/Instrument Appropriateness 

1. Measures and tests related to item difficulty (p-values and Z-test):

The difficulty of an item is traditionally measured by the proportion (or percent) correctly answering the item,
or p-value, denoted p. This may be adjusted to account for guessing (cf. Lord and Novick, 1968, and Magnusson,
1967). In addition, WLC/MRC statisticians proposed a Z-test to test the hypothesis that the students, as a
group, are at the chance (guess) level of functioning on a given item. This test is conducted by the following
formula:

$$Z = \frac{p - \dfrac{1}{f+1}}{\sqrt{\dfrac{1}{n} \cdot \dfrac{1}{f+1} \cdot \dfrac{f}{f+1}}}$$

where p is the p-value, f is the number of foils, and n is the number of respondents (sample size). If the
hypothesis that p = 1/(f + 1) is true, i.e., the population sampled is functioning at the chance or guess level,
the above statistic has (approximately) a standard normal distribution (for large n, say n > 50). If Z is positive
and statistically significant, one may conclude that the students are operating above the chance level. On the
other hand, if Z is negative and statistically significant, one concludes that the students are operating "below
the chance level." This may be an indication that the item is wrongly keyed or that the item format is
inappropriate. If Z is not statistically significant, one concludes that the students are guessing.
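
A minimal sketch of the chance-level test follows, assuming the statistic takes the usual one-sample form for a proportion, with chance value 1/(f + 1) and the standard error computed under the chance hypothesis. The function name is hypothetical. The inputs are the rounded overall figures from the sample printout in Example 3 (p = .83, n = 310, three foils), so the result is close to, but not exactly, the printed Z of 23.54.

```python
import math


def chance_level_z(p, n_foils, n_respondents):
    """Z statistic for testing whether p equals the chance level 1/(f + 1)."""
    chance = 1.0 / (n_foils + 1)
    # Standard error of a proportion when the chance hypothesis is true.
    standard_error = math.sqrt(chance * (1.0 - chance) / n_respondents)
    return (p - chance) / standard_error


z = chance_level_z(p=0.83, n_foils=3, n_respondents=310)
print(round(z, 2))
```

A value this large is far beyond any conventional critical value of the standard normal distribution, in line with the conclusion that the students are operating well above chance on that item.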

2. Chi-square test for uniform foil response distribution:

Ideally, one would hope that the foils in a multiple-choice item would draw about equally. To test this
hypothesis (conditional on a given total number of foil responses), one may compute the chi-square statistic:

$$\chi^2 = \sum_{i=1}^{f} \frac{(O_i - E_i)^2}{E_i}$$

where f is the number of foils, $O_i$ is the observed number of responses to foil i, and $E_i = \sum O_i / f$, the
"expected" number of responses to foil i under the uniform foil response hypothesis, i = 1, 2, ..., f.
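
The uniform-foil statistic can be sketched directly from these definitions. The counts used here (15, 10, and 7 responses to foils A, B, and D) are the male-group foil counts shown in the sample printout for Example 3; the function name is hypothetical. The result is close to the chi-square of 3.05 printed for that group.

```python
def uniform_foil_chi_square(observed):
    """Chi-square statistic for equal attraction across the foils."""
    n_foils = len(observed)
    expected = sum(observed) / n_foils  # same expected count for every foil
    return sum((count - expected) ** 2 / expected for count in observed)


chi_square = uniform_foil_chi_square([15, 10, 7])
degrees_of_freedom = 3 - 1
print(round(chi_square, 4), degrees_of_freedom)
```

The statistic is referred to a chi-square distribution with f - 1 degrees of freedom, here 2.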
3. Measures of internal consistency (point biserial correlation coefficient):

The classical measure of "internal consistency" of a test, i.e., the degree to which the items measure the same
thing, is the point biserial correlation coefficient, denoted $r_{pb}$ (cf. Lord and Novick, 1968, and Magnusson,
1966).

In the Texas Career Education Measurement System (CEMS) items are grouped or clustered around learner
outcomes so that each outcome in effect becomes a subtest of a larger instrument. The point biserial is the
degree to which performance on an item is correlated with performance on the learner outcome, i.e., the
consistency with which students correctly or incorrectly answer an item in relation to its outcome score. Moreover,
point biserials were computed for each cultural (ethnic and sex) group.

The p-value influences the value of the point biserial. In particular, when p becomes close to 0 or 1, $r_{pb}$
becomes close to zero. The "WLC/MRC Instrument Analysis" computes a statistic called "maximum" $r_{pb}$,
which is simply the value $r_{pb}$ would achieve if p were equal to 1/2. It may be obtained from $r_{pb}$ as follows:

$$\max r_{pb} = \frac{r_{pb}}{2\sqrt{p(1-p)}}$$

This statistic, when contrasted with the value of $r_{pb}$, provides an indication of the extent to which the p-value
is influencing the point biserial. Thus, if $r_{pb}$ is quite low, and max $r_{pb}$ is not low, this may be due to a low (or
high) p-value, and not (necessarily) due to lack of internal consistency.
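
As a hedged sketch, the adjustment can be written as a rescaling of the point biserial by the factor 2·sqrt(p(1 - p)), which equals 1 when p = 1/2. This form is an assumption of the example; it does reproduce the overall figures in the sample printout for Example 3, where a point biserial of 0.54 at p = .83 is listed with a maximum of 0.72. The function name is hypothetical.

```python
import math


def max_point_biserial(r_pb, p):
    """Value the point biserial would take if the p-value were 1/2.

    Assumes r_pb is proportional to sqrt(p * (1 - p)) with everything else
    held fixed, so the correction factor is 2 * sqrt(p * (1 - p)).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return r_pb / (2.0 * math.sqrt(p * (1.0 - p)))


print(round(max_point_biserial(0.54, 0.83), 2))
```

When p is already 1/2 the function returns the point biserial unchanged, which is the defining property of the "maximum" value.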

4. Measures of instrument reliability (KR-20):

The Kuder-Richardson "Formula 20," or KR-20, was used to measure test reliability (cf. Lord and Novick, 1968,
and Magnusson, 1966). The KR-20 is an internal consistency measure of reliability. Thus, like the point biserial,
it measures the degree to which the items all measure the same thing. Unlike the point biserial, the KR-20
provides one measure for any given instrument. KR-20's were computed for each outcome instrument. Overall,
37% of the outcomes had KR-20's greater than 0.50.
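
For readers unfamiliar with the coefficient, the sketch below computes KR-20 in its standard textbook form, k/(k - 1) multiplied by one minus the ratio of the summed item variances p(1 - p) to the variance of total scores. The small response matrix is invented for illustration and is not project data. The use of the population (divide-by-N) variance is an assumption of this sketch.

```python
def kr20(responses):
    """KR-20 for a matrix of 0/1 item scores (rows = students)."""
    n_students = len(responses)
    n_items = len(responses[0])
    # Item p-values: proportion of students answering each item correctly.
    p_values = [
        sum(row[i] for row in responses) / n_students for i in range(n_items)
    ]
    sum_item_variances = sum(p * (1.0 - p) for p in p_values)
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_students
    total_variance = sum((t - mean_total) ** 2 for t in totals) / n_students
    return (n_items / (n_items - 1)) * (1.0 - sum_item_variances / total_variance)


# Invented scores for five students on a four-item outcome.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(scores), 2))
```

With so few items per outcome, coefficients well below those of full-length tests are to be expected, which may help put the 0.50 benchmark above in context.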

Cultural Validity Analysis 

Are the items and instruments measuring what they are intended to measure for students in each cultural
group? The question of the cultural validity of items and instruments is investigated using an approach
developed by the coordinator and others (cf. Veale and Foreman, 1975). The approach focuses on the foil
response distribution broken down by cultural group. Three cultural variables were considered in the cultural
validity analysis of the Texas career education test items: (1) ethnic origin (Mexican-American, black, and
other), (2) sex (male, female), and (3) "educational emphasis index" (high, medium, and low). The data
available from the "Student Information Sheet" given to each student at field test time were utilized to obtain
the aforementioned cultural information. (See Appendix E.) Only the first two (ethnic and sex) cultural
variables are considered in the discussion which follows. The extent of variation in foil responses across
cultural groups is said to measure "cultural variation," which may be evidence of cultural bias.

1. Description of the statistical techniques:

The following example serves to illustrate the approach and statistical technique. Suppose that the total
number in the sample is 500, with 125 blacks and 375 non-blacks. Suppose further that 75 blacks and 225
non-blacks answer the item correctly, yielding identical p-values of 0.6, and that the foil distribution is as in the
table below:

Item Data With Equal p-value and
Heterogeneous Foil Response Distributions

                     Foil
              A      B      C      Totals
Black         40     10      0       50
Non-black     50     50     50      150
Totals        90     60     50      200



Clearly, blacks are strongly attracted to foil A, while non-blacks are uniformly attracted to the three foils. This
may be an indication of cultural bias, i.e., because of cultural factors only (or primarily) blacks are drawn to
foil A. If this differential attraction to foil A were not present, the p-values for blacks and non-blacks might have
been quite different.

On the other hand, it may be that foil A is a more reasonable response than B or C among students who have
been instructed to the objective being measured. In this case, it might be that blacks have been instructed
(and thus find that foil A is more attractive than B or C) while non-blacks have not been instructed (and thus
are uniformly attracted to the foils due to guessing). Another possibility is that A is a "bad" or "tricky" foil;
suppose that blacks had not been instructed to the objectives, while the non-blacks had been instructed. In this
case, blacks may be drawn to the "tricky" foil simply because they have not been instructed. In these cases, no
cultural bias can be claimed. Cultural variation does not imply cultural bias. The approach may thus yield
valuable diagnostic information about the group or about the item (other than bias), as well as information
about cultural bias (Appendix F). Several statistical techniques were employed to measure the degree of
cultural variation in foil response distributions. One of these is the chi-square statistic based on the foil
responses for the various cultural groups. (Formally speaking, this statistic tests the statistical hypothesis that
cultural groups and foil response are independent or uncorrelated.) For example, the chi-square for the data in
the previous table is 37.04, which is statistically significant at the .001 level. A measure of the degree of
cultural variation is Cramer's V statistic, which is found to be 0.43 in this example. Other statistics which have
probabilistic interpretations and operational significance irrespective of the sample size (in this context, the
total number of foil responses) were utilized to measure the extent of cultural variability, especially in cases
where the chi-square does not apply. For a more detailed description of the statistical procedures used to
measure the cultural variation of items, see Appendix G.
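
The chi-square and Cramer's V quoted for the illustrative table can be reproduced with a short sketch. The code below uses the standard definitions for a two-way table of counts (expected count = row total x column total / N, and V = sqrt(chi-square / (N x (min(rows, columns) - 1)))); the function name is hypothetical, and this is not the WLC/MRC program itself.

```python
import math


def chi_square_and_cramers_v(table):
    """Chi-square test of independence and Cramer's V for a count table."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * column_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    smaller_dimension = min(len(table), len(table[0]))
    cramers_v = math.sqrt(chi_square / (n * (smaller_dimension - 1)))
    return chi_square, cramers_v


# Foil responses (A, B, C) from the table: black, then non-black.
foil_table = [
    [40, 10, 0],
    [50, 50, 50],
]
chi_square, cramers_v = chi_square_and_cramers_v(foil_table)
print(round(chi_square, 2), round(cramers_v, 2))
```

The sketch assumes every expected count is positive; a foil chosen by no one in any group would have to be dropped before computing the statistic.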

In addition to measures of cultural variation, conventional item analysis statistics (such as point biserials)
were used as supplementary indicators of possible cultural bias. For example, if the chi-square and Cramer's
V statistics manifest a high degree of cultural variation for an item and, moreover, the point biserials vary
across cultural groups, the item is probably culturally biased. (However, variation in the point biserials alone,
without corresponding cultural variation in foil responses, does not constitute clear evidence of cultural bias.)

A computer program has been written at WLC/MRC to compute the various statistics used to measure cultural
variation. The data from the field tests were analyzed according to the aforementioned techniques. Some
tentative "cut-off" values (of chi-square, V, etc.) were suggested by WLC/MRC statisticians, but were used only
as rough guidelines. Flexibility of application was strongly encouraged.

2. Content analysis: 

The content analysis is handled by grade (and grade combinations). Appendix H consists of a set of tables for
the upper and lower grade samples in which a probable cause of cultural variability (of foil responses) is
presented for each item by booklet number and test item number. Following the tables are several sample
items which manifest cultural variation (statistically) and a brief explanation of the probable cause of the
variability (bias, diagnostic foils, bad foil, bad format). In some cases, variability existed at two grade
levels and is discussed for both grade levels together. Some items seemed to have more than one possible
source of variability. These items are discussed under separate combination headings.

It should be made clear that the discussion of these items in Appendix H constitutes the (data-based) content
hypotheses of one specialist.

Item and Instrument Analysis: A 'Global' View 

In order to take maximum advantage of the available statistical data, a flexible, "global" approach is
recommended. Pre-assigned "cut-offs" were used as rough guidelines only. Rigid application of such systems
(however tempting for expedient decision making) was strongly discouraged.

All of the statistics discussed in the previous sections should be considered in making decisions about items.
The following three examples serve to illustrate how this process should work.

Example 1. (Item 12, Booklet 11, Grade 7)

Item: Grace wants a job where she does not have to deal with strangers.
Which career do you feel would BEST match Grace's goal?

(A) receptionist

(B) bookkeeper

(C) public librarian

(D) salesperson

The p-value (overall) is .56, which yields a Z value well above chance level. It is noted, however, that the
p-values are quite variable across ethnic groups, with minorities doing worse than anglos.

The chi-square for testing uniformity of foil responses is highly significant, due to the strong attraction to foil
"A." The cultural validity indices are as follows: chi-square = 9.906 (significant at .05 level), V = .184,
T = .035, T95 = -.012, L* = 0.000, L*95 = .000. There is some degree of cultural variability present.

Minorities ("MA" and "BL") are more attracted to "C" and "D" than are "others." Moreover, blacks are more
attracted to "D" while Mexican-Americans are more attracted to "C" (although "A" is the most popular foil).
Finally, the point biserial (overall) is reasonably high (.58), indicating fairly good internal consistency. It is
interesting to note, however, that the point biserial is only 0.39 for blacks, while it is 0.58 for "others." The
conclusion is that the item is ethnically biased. Minorities simply have had less experience with these
occupations.

Example 2. (Item 12, Booklet 72A, Grade 8)

Item: Which ONE of the following quotations reflects an individual's positive
attitude toward participation in the economic system of the United
States?

(A) "Big Businesses cheat on their taxes, so I do too."

(B) "Irish Wool is of better quality than local wool."

(C) "I've invested my savings in a local corporation."

(D) "I think that I should be able to get money any way I can."

The overall p-value is .71, well above chance level. The numbers responding to the foils are not sufficient to
perform the chi-square for testing cultural validity. However, L* is high, 0.212; the lower 95% confidence
limit is 0.05.

Note the differential attraction of "A" and "D" for Mexican-Americans and blacks. Finally, note that the point
biserial varies from 0.53 (Mexican-American) to 0.36 (black) to 0.29 (other). There is some evidence of cultural
bias in this item, although the total number of respondents was low.

Even though there are sound statistical reasons for eliminating this item from the instrument, it may be argued
that it is preferable to retain the item and use the diagnostic information to provide guidelines for instruction.
The middle ground between throwing out the item and keeping it as it stands is to revise it. Perhaps an
improved correct response (a more positive, constructive, creative idea for participating in the economic system)
would help to reduce the cultural bias.

Example 3. (Item 8, Booklet 11, both grades)

Item: Graduation is coming soon. You have no idea of what you want to do
when you leave school. You are fearful about your future and have
stayed awake at night trying to decide what to do.

Below are actions that you might take in an effort to solve your problem.
Identify the action that is LEAST helpful by darkening the appropriate
letter on your Answer Sheet.

(A) talk with the school counselor

(B) write to universities, community colleges and trade schools to learn about opportunities

(C) find out what your best friend is going to do

(D) get information and advice from the local state employment office



FORM 01                         SAMPLE OF ITEM PRINTOUT

OBJECTIVE 0107000

                                                          SIG    95%                    CHI    SIG    PT BI   MAX PT
ITEM  GR  SP      N    A    B    C     D   E-OMIT    Z    LEV    (2 TAIL)    (1 TAIL)   SQ     LEV    SER     BI SER
008   10         310    8    5   83*    4   0 0 0   23.54  .000   0.77 0.88   0.78       5.80   .055   0.54    0.72
008   10  M      151   10    7   79*    5   0 0 0   15.17  .000   0.71 0.87   0.72       3.05          0.57    0.69
008   10  F      159    6    4   87*    3   0 1 0   17.99  .000   0.79 0.95   0.80       2.78          0.55    0.81
      TOTAL      310    8    5   83*    4   0 0 0   23.54  .000   0.77 0.88   0.78       5.80   .055

GRADE 10                        FOILS

GROUP      A     B     D     (heading illegible)
M         15    10     7     0
F         10     6     4     1

COLUMNS A B D HAVE BEEN USED FOR THE CHI SQ

CHI-SQ = 0.052    SIG LEV =          V = 0.032    DF = 2.000
T  = 0.001    C.I. = -0.006
L* = 0.000    C.I. =  0.000
T* = 0.001    C.I. =  0.000

[The columns E, F, INV, DM, and OMIT are shown together as E-OMIT. Blank cells mark entries that are not legible
in this copy.]



This item is working well according to all criteria. The p-value is significantly above chance, the foils are
drawing uniformly, the point biserial is fairly high (.54), and all the cultural (both ethnic and sex) validity indices
are low. A statistically sound item.



CHAPTER V

SENSITIVITY TO INSTRUCTION

Introduction 

An important element of the item tryout program required the utilization of the WLC/MRC test items with a
special group of students who had received instruction specifically designed to develop the behavior
described by a selected number of the learner outcomes. This particular phase of the item tryouts (referred
to as in-depth tryouts) was expected to provide information for determining whether the test items measured
a dimension of knowledge that was sensitive to instruction. To accomplish this phase of the tryouts, the
PARTNERS project was committed to the preparation of learning modules which directly addressed elements of
forty-four of the seventy-nine basic outcomes for which test items were being prepared. Modules were to be
prepared for students in the eighth and eleventh grades in various subject areas.

Assumptions

The decision to conduct a study of the sensitivity to instruction of the WLC/MRC developed test items was
based in part upon the following assumptions:

• Criterion-referenced test items should measure student development in terms of clearly stated objectives.

• Criterion-referenced test items should reflect changes which may take place in student capability with
regard to objective attainment.

• The behaviors described by the WLC/MRC prepared objectives were elements of the basic learner outcomes
and could be developed in students within the classroom.

• Learning modules could be developed that were adequate for the identified objectives and appropriate
for the students to be instructed.



Procedures 

The theories and procedures suggested by Roudabush (1973) provided the basis for this study. The statistical
analyses proposed by Kosecoff and Klein (1974) were among those applied to the data developed. The
following procedures utilized in conducting this study are presented in the approximate sequential order of
occurrence.

• WLC/MRC behavioral objectives, derived from the basic learner outcomes, were selected which were
believed to be amenable to instruction within a relatively short period of time.

• Schools were identified and teachers (classrooms) were selected to function as experimental groups.
These groups of students were pretested, instructed, and posttested utilizing WLC/MRC test items.
Participating teachers were volunteers.

• Personal interviews were conducted with participating teachers to identify the curriculum in use and
the resources appropriate for infusion of necessary new material.

• Resources such as books, curriculum guides, etc., were obtained for the development of infused
learning activities.

• Schools and classrooms were identified to function as comparison groups. Teachers in the comparison
group classrooms were also volunteers. Students were not exposed to material contained in PARTNERS
special curriculum modules.

• Learning modules were prepared to infuse the selected career education concepts into the ongoing
curriculum.

• The learning modules were submitted to participating teachers for review and critical comment.

• An evaluation form was prepared to obtain teacher reactions to the modules.

• Career education test items (mini-tests), answer sheets, and scoring sheets were prepared by
WLC/MRC.

• A manual was also provided by WLC/MRC to guide teachers in the administration of the mini-tests.

• Testing materials, learning modules, and curriculum resource materials were delivered to and
collected from the teachers participating in the study.

• Students' answer sheets were scored and the data statistically analyzed by WLC/MRC.

• Teacher evaluation data were compiled for use within the project.




Selection of Outcomes/Objectives

In the selection of basic learner outcomes and derived WLC/MRC behavioral objectives toward which learning
modules would be prepared, several factors were considered. First, some of the outcomes which describe
attitudinal behavior were identified as not being amenable to instruction over the relatively short time span
available. Second, those outcomes which had been identified previously as being more appropriately
introduced and emphasized in the lower grades tended to be eliminated as inappropriate for instruction in the
eighth and eleventh grades. Finally, outcomes were selected for the tryout program which, in the judgment of
the professional staff, could be at least partially (measurably) developed during the period allocated, i.e.,
approximately ten weeks. After screening the total number of outcomes for which test items were being
developed, 52 objectives (elements of 44 outcomes) were selected for this in-depth item tryout study.

Selection of Schools and Teachers 

Two factors of primary concern in the selection of schools for this study were the degree of willingness to
participate displayed by the individuals contacted and the geographic location of the schools concerned. The
appearance of a reluctant attitude on the part of either administrators or teachers was considered to be grounds
for the non-selection of particular schools. Volunteers were sought who would accept the necessary curriculum
and schedule modifications which would result from use of the specified learner activities and student testing.
With regard to geographic location, the anticipated need for frequent visits to the participating schools by
PARTNERS staff members inhibited the consideration of schools more than one and one-half hours driving time
from Arlington. Other considerations involved school size and the ethnic composition of the student body in
grades eight and eleven. Because of the noted restrictions on school selection, the inclusion of a proportionate
number of students from each ethnic group was not possible. However, the desirability of obtaining responses
from each of the three major ethnic groups (Anglo, Black and Mexican-American) was recognized and was
a consideration in school selection. School size was also important in that small classrooms would have
required the participation of an unacceptably large number of teachers to assure that a minimum number of
students responded to each test item. Following consultation with WLC/MRC personnel, this minimum was
determined to be ho students.

By applying the foregoing general criteria, 33 schools in sixteen school districts were identified. No difficulties
were experienced in obtaining the approval of administrators in any district or school contacted. One hundred
thirty-eight teachers in the 33 schools volunteered to participate. The expressed desire to become involved in
this aspect of the PARTNERS program by all of the administrators and a large majority of the teachers
contacted was particularly gratifying.



Experimental and Comparison Groups 

The study design required students in each of the classrooms participating to function in a dual capacity, as
members of both experimental and control groups. For example, an experimental class was pretested,
instructed and posttested utilizing appropriate test items. The same class also functioned as a control for
another experimental group by being pre- and posttested utilizing test items unrelated to the instructional
material to which the class had been exposed. This methodology was feasible because the association
between items written for different learner outcomes is weak to non-existent. In addition, the total number of
students and classrooms required for the study was reduced by approximately 50% by utilizing this particular
technique.

Statistical Procedures

In criterion-referenced testing strong emphasis is placed on the effectiveness of test items to discriminate
between those students who have profited from instruction and those students who have not. Three types of
indices were used in this study to determine "sensitivity-to-instruction."

• The Internal Sensitivity Index (ISI) measures item quality from the perspective of the total test's
discriminating power.

• The External Sensitivity Index (ESI) and the Roudabush "S" measure an individual item's ability to
reflect learning (independent of the test).

• The Objective Sensitivity Index (OSI) measures the total test's ability to discriminate between learners
and nonlearners.



This study utilized experimental and comparison groups for each test, with both groups receiving the pre- and
posttests and the experimental group receiving instruction. A Z-test was utilized to detect statistically
significant differences between the indices reported for the experimental and the comparison groups. (See
Appendix 1.)

The Internal Sensitivity Index (ISI) is computed as follows:

\[ \mathrm{ISI} = \frac{n_2 - n_1}{n}, \]

where $n_1$ is the observed frequency of students who answered item i correctly on the posttest but failed the
pre- and posttest, $n_2$ is the observed frequency of students who answered item i correctly on the posttest but
failed the pretest and passed the posttest, and $n$ is the total number of respondents who correctly answered
item i.

The External Sensitivity Index (ESI) is computed as follows:

\[ \mathrm{ESI} = \frac{m_2 - m_1}{m}, \]

where $m_1$ is the observed frequency of respondents who missed item i on the pretest and posttest, $m_2$ is the
observed frequency of respondents who missed item i on the pretest but responded correctly on the posttest,
and $m$ is the total number of respondents.

The Objective Sensitivity Index (OSI) is computed as follows:

\[ \mathrm{OSI} = \frac{N_2 - N_1}{N}, \]

where $N_1$ is the number of respondents who failed the pretest and the posttest, $N_2$ is the number of
respondents who failed the pretest but passed the posttest, and $N$ is the total number of respondents.

The Roudabush "S" is an index of the degree to which examinees are selecting the correct response to the
item as a function of the instruction received between pre- and posttest, that is, a sensitivity index. This index
is simply the proportion of cases that missed the item on the pretest and then answered it correctly on the
posttest after a correction for guessing had been applied.

The values for each index range from -1 to +1. A score of -1 would occur when no one learned. Such a result
suggests that either instruction failed to benefit any of the students or, more realistically, that the item fails to
discriminate among learners. A score of +1 is obtained when all students miss an item on the pretest and
correctly answer it on the posttest. This is the ideal situation: the item shows maximum change in the direction
of learning. Any scores in the pass-fail and pass-pass cells will lower the absolute values of the indices.
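
The three index definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the boolean list layout are hypothetical, the ESI and OSI are treated as the same calculation applied to item-level and test-level pass/fail data respectively, and the ISI denominator is read as the number of respondents answering the item correctly on the posttest.

```python
def gain_index(pre, post):
    """(fail-then-pass minus fail-then-fail) / all respondents.

    Applied to item-level right/wrong data this follows the ESI definition;
    applied to test-level pass/fail data it follows the OSI definition.
    """
    fail_fail = sum(1 for a, b in zip(pre, post) if not a and not b)
    fail_pass = sum(1 for a, b in zip(pre, post) if not a and b)
    return (fail_pass - fail_fail) / len(pre)

def internal_sensitivity(item_post, test_pre, test_post):
    """ISI sketch: restricted to respondents who got the item right on the posttest."""
    rows = list(zip(item_post, test_pre, test_post))
    n = sum(1 for item_ok, _, _ in rows if item_ok)
    n1 = sum(1 for item_ok, pre, post in rows if item_ok and not pre and not post)
    n2 = sum(1 for item_ok, pre, post in rows if item_ok and not pre and post)
    return (n2 - n1) / n

# Hypothetical data for four students (True = correct / passed).
item_pre = [False, False, False, True]
item_post = [True, True, False, True]
print(gain_index(item_pre, item_post))            # one fail-fail, two fail-pass

# The two extremes described in the text.
print(gain_index([False] * 3, [True] * 3))        # everyone learned
print(gain_index([False] * 3, [False] * 3))       # no one learned

test_pre = [False, False, False, True]
test_post = [True, True, False, True]
print(round(internal_sensitivity(item_post, test_pre, test_post), 3))
```

The sketch does not guard against an empty sample or an item nobody answered correctly; a production version would need to handle those cases explicitly.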

The difference in the proportion of gainers (those passing the posttest and failing the pretest) out of the total
number of potential gainers (those who failed the pretest) for the experimental and comparison groups was
also computed. A Z-test of significance was conducted if the number who failed the pretest was large (greater
than 20). For small samples, Fisher's "exact test" was used (cf. Snedecor and Cochran, 1967). Similar tests
were conducted on the proportion of gainers (experimental vs. control) for each item. Specifically, Z or Fisher's
tests were conducted using the difference in the proportions of (1) those passing the posttest among those
failing the pretest and correctly answering the item, and (2) those correctly answering the item on the posttest
among those missing the item on the pretest.
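
The comparison of gainer proportions can be sketched as follows. Several details are assumptions rather than statements of the WLC/MRC procedure: the Z statistic uses a pooled standard error, the "greater than 20" rule is applied to the smaller of the two groups, and the Fisher test is one-sided (experimental group gaining more). All function names are hypothetical.

```python
import math

def two_proportion_z(gain_e, n_e, gain_c, n_c):
    """Z statistic for the difference in gainer proportions (pooled standard error)."""
    pooled = (gain_e + gain_c) / (n_e + n_c)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_e + 1 / n_c))
    return (gain_e / n_e - gain_c / n_c) / se

def fisher_exact_one_sided(gain_e, n_e, gain_c, n_c):
    """P(at least gain_e gainers in the experimental group | fixed margins)."""
    total_gain = gain_e + gain_c
    denom = math.comb(n_e + n_c, total_gain)
    upper = min(n_e, total_gain)
    return sum(math.comb(n_e, k) * math.comb(n_c, total_gain - k)
               for k in range(gain_e, upper + 1)) / denom

def compare_gainers(gain_e, n_e, gain_c, n_c, large=20):
    """Pick the test by sample size; n_e and n_c count those who failed the pretest."""
    if min(n_e, n_c) > large:
        return "z", two_proportion_z(gain_e, n_e, gain_c, n_c)
    return "fisher", fisher_exact_one_sided(gain_e, n_e, gain_c, n_c)

# Hypothetical counts: gainers out of potential gainers in each group.
print(compare_gainers(30, 40, 10, 40))   # large sample, Z statistic
print(compare_gainers(5, 6, 1, 6))       # small sample, exact probability
```

With the second set of counts, the exact probability is 37/924, roughly .04, which would be significant at the .05 level under the one-sided assumption made here.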

A sample page of printout from the sensitivity-to-instruction analysis conducted by WLC/MRC statisticians is
given in the following table.



TABLE WLC/MRC Sensitivity to Instruction Sample Printout 

The significance tests (for differences in ISI, ESI, and proportion of gainers) were very useful since the indices
are quite new and little is known about what constitutes a "good" value (of ISI, ESI, etc.). For example, if an
index is high, say greater than 0.75, and significantly higher than for the comparison group, it can be inferred
that the item is really sensitive to instruction. Contrariwise, if there is no significant difference and both values
of the index are either high, medium, or low, it cannot be inferred that the item is sensitive to instruction. It may
be sensitive to some other (apparently common) factor, which is not related to instruction. In the case where
the indices are low or negative, and the difference is significant, the interpretation is questionable. The
interpretation of the test for difference in proportion of gainers was more straightforward since a significant
difference in these tests indicated that the item manifested a real difference in gain between those who had been
given instruction and those who had not. The mastery level established for passing or failing was varied for
this study. These indices were computed by WLC/MRC for the 50%, 70%, 80%, and 90% levels.

The participating teachers were asked to rate their students as follows: students who generally earn grades
equal to or greater than B, and students who generally earn grades less than B. All indicators of sensitivity to
instruction were computed for the total sample and for these two sub-groups. (The analysis given in the sample
printout is for the "B or above" group.)

Results of the Study

Sufficient data for analysis purposes were received on items addressed to 51 of the 52 objectives selected for
the study. The 51 separate tests were composed of 111 items, many of which required multiple responses. The
computer treated each of the separate responses as individual items. This resulted in a total item count of 215.




The Internal Sensitivity Index (ISI) and Objective Sensitivity Index (OSI) are both dependent upon established
mastery levels to determine the number and percent of students passing and/or failing the tests. Testing
results were analyzed at various mastery levels from 50% to 90%. These levels represent the percent of the
total number of items written for a single objective which a student must answer correctly to achieve mastery
of the objective. As the mastery level criterion was lowered, the values of both the ISI and the OSI tended to
increase. However, a lower mastery level (say 50% as opposed to 80%) resulted in more students passing
the pretest. This caused the indices to reflect learning for a smaller percent of the sample. With the mastery
level established at 80%, a higher percentage of the students failed the pretest, thereby increasing the number
who might profit from instruction and providing a more reliable indicator of sensitivity.
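
The effect of the mastery level on the number of students who fail the pretest can be illustrated with a small sketch. The scores below are hypothetical, as are the function names; the only rule taken from the text is that a student masters an objective by answering at least the stated percent of its items correctly.

```python
def mastered(n_correct, n_items, level_percent):
    """True if the share of items answered correctly reaches the mastery level."""
    # Integer arithmetic avoids floating-point surprises at the cut-off.
    return n_correct * 100 >= level_percent * n_items

def pretest_failures(pretest_scores, n_items, level_percent):
    """Number of students who do not reach mastery on the pretest."""
    return sum(1 for s in pretest_scores if not mastered(s, n_items, level_percent))

# Hypothetical pretest scores for eight students on a five-item objective.
scores = [5, 4, 3, 3, 2, 4, 1, 5]
failures_by_level = {lvl: pretest_failures(scores, 5, lvl) for lvl in (50, 70, 80, 90)}
print(failures_by_level)
```

In this made-up example two students fail the pretest at the 50% level and six fail at the 90% level, so a stricter level leaves more students who could show a gain after instruction.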

The Internal Sensitivity Index (ISI) measures item quality from the perspective of the total test's ability to
discriminate between mastery and non-mastery of the objectives. One hundred two items were found to have a
positive ISI score at the 80% mastery level. The Z-test for ISI yielded questionable results, since many of the
statistically significant ISIs were negative or quite low. Using the test for difference in proportion of gainers, it
was found that at the 80% mastery level, twelve items were significantly different at the .10 level,
eight at the .05 level, and sixteen at the .01 level.

The External Sensitivity Index (ESI) measures an individual item's ability to reflect learning. One hundred one
items were found to have positive ESI scores. Using the test for difference in proportion of gainers (on the
items), four items showed a statistically significant difference between the experimental and the comparison
groups at the .10 level, and fourteen items were significant at the .01 level.

The Roudabush "S" is a measure of an item's sensitivity and includes a correction for guessing. Roudabush
found that at least 50 cases are needed to establish a reliable index; i.e., at least 50 students who fail the
pretest should be instructed and subsequently posttested. Nineteen items in this study met this criterion and
thirteen of these items had a positive index.

The Objective Sensitivity Index (OSI) measures the total test's (for an objective) ability to discriminate
between learners and non-learners. Nine objectives had a positive OSI score at the 80% mastery level. Using the
test for difference in proportion of gainers (on the tests), two objectives were found to be significant at the .10
level, five objectives were significant at the .05 level, and six were significant at the .01 level.

When the total sample was divided into two groups by grade average (A or B students, and C or poorer
students, according to teacher ratings), analysis of comparative data yielded the predictable results. The
students rated B or above yielded higher sensitivity indices than those rated below B. For example, at the 80%
mastery level eighteen objectives showed a positive OSI score for A or B rated students and nine objectives
showed a positive OSI score for those rated below B.



Limitations of the Sensitivity to Instruction Study 

Several factors combine to severely limit the usefulness of the data collected for this study. First among these
is the item/objective/outcome relationship which existed when the sensitivity to instruction study was initiated.
Final review and acceptance of the objectives and related test items prepared by WLC/MRC had not been
completed by TEA or by the PARTNERS project prior to printing of the mini-tests to be used in the study.
Subsequent joint review of the objectives and items by the parties concerned (WLC/MRC, TEA, and PARTNERS)
resulted in the elimination of approximately 25% of the items developed by WLC/MRC to that time. In addition,
major revisions and format changes were made to more than half of the remaining items. These revisions or
changes were based upon the professional judgment of the three parties participating in the review. The items
and objectives eliminated or revised did not adequately address the elements of the basic learner outcomes
for which they had been prepared. The ultimate result of the changes made was to reduce by approximately
65% the number of viable items used in this study.

A second factor limiting the usefulness of the data relates to the quality of the test items available. Prior to this
study the test items utilized had not been pilot-tested or tried out in any fashion with students. There was
therefore no information available with regard to the readability, understandability or appropriateness of the
test items in a testing environment. (The test items had been reviewed by educators, and by students at the
junior and senior levels, as single items but not in a test context format.) Erratic student responses,
characterized by unsymmetrical foil distribution patterns for many items, in both the control and experimental
groups, are believed to be directly related to this factor. In addition, a large number of items were correctly
answered on the pretest by very high percentages of the students. For example, 84 items were correctly answered
on both the pre- and posttests by 80% or more of the students participating. An adequate tryout or a pilot testing
program would have identified many of the test items as being too easy for eighth and/or eleventh-grade
students.



A third factor, which is attributable in part to the fact that the items had not been previously tried out, also
tends to limit the value of the data collected. This involves the number of students included in the study.
Roudabush found that for a reliable sensitivity index to be computed (Roudabush "S") the number of students
failing the pretest (and therefore requiring instruction) should be at least 50. In many instances, fewer than a
dozen of the students participating in this study failed to pass the pretest. In fact, only nineteen of the 215
items utilized met the criterion established by Roudabush. This situation could be avoided in the future by
conducting an adequate tryout or pilot-test to eliminate inappropriate items prior to a study of this type.

A fourth factor was the question of instruction to objectives. The extent to which the quality and effectiveness
of instruction varied across objectives directly influences the sensitivity indices. The variability of instruction
represents a confounding variable which disturbs the comparability of the sensitivity indices across objectives.

CHAPTER VI
SYSTEMS FOR REPORTING FIELD TEST
RESULTS TO TEACHERS

Introduction 

The ultimate success or failure of the measurement system will depend largely upon the usefulness of the
information that the tests generate. Thus, it is essential that test data reported to the potential users of the
information be written so that they can be easily understood. The systems used for reporting the results of the
March field tests to students and school personnel were of a developmental nature, and criticism from those
receiving the test results was encouraged.

The purpose for reporting the test results is to provide students and school personnel diagnostic information
about student performance in terms of the behaviors described by the learner outcomes. Two types of reports
were used: (1) a modified version of the SCORE (WLC/MRC) student report and (2) a TEA-devised report.



WLC/MRC Format

The modified SCORE report contains information on (1) whether each student mastered each outcome, (2) the
percent of outcomes mastered by each student, and (3) the percent of students mastering each outcome. A
50% mastery level was used, i.e., a student must have correctly answered at least half of the items measuring
an outcome to be classified as having "mastered" the outcome. The 50% level was used, rather than a higher,
more stringent level, since no instruction toward the learner outcomes was assumed. A sample report
(Westinghouse Learning Corporation SCORE Class List) is provided in Table 1. The outcome "legend," i.e., the
numerical outcome codes with the corresponding outcome descriptions, is provided (for test booklet 11) in
Table 2.
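
The three pieces of information in the modified SCORE report can be derived from raw item counts as in the sketch below. The data structure, student labels, and function name are hypothetical; only the 50% mastery rule and the three reported quantities come from the description above, and the sketch assumes every student was tested on every outcome.

```python
MASTERY_PERCENT = 50  # at least half of an outcome's items answered correctly

def class_summary(results):
    """results maps student -> {outcome code: (items correct, items asked)}."""
    mastery = {
        student: {code: correct * 100 >= MASTERY_PERCENT * asked
                  for code, (correct, asked) in outcomes.items()}
        for student, outcomes in results.items()
    }
    # (2) percent of outcomes mastered by each student
    student_percent = {s: 100 * sum(m.values()) / len(m) for s, m in mastery.items()}
    # (3) percent of students mastering each outcome
    codes = sorted({code for m in mastery.values() for code in m})
    outcome_percent = {c: 100 * sum(m[c] for m in mastery.values()) / len(mastery)
                       for c in codes}
    return mastery, student_percent, outcome_percent

# Hypothetical class of two students tested on two outcomes of four items each.
results = {
    "student_1": {"01-03": (2, 4), "01-04": (1, 4)},
    "student_2": {"01-03": (3, 4), "01-04": (4, 4)},
}
mastery, student_percent, outcome_percent = class_summary(results)
for student, row in mastery.items():
    # A minus marks non-mastery and a blank marks mastery, as in the class list.
    marks = "  ".join("-" if not row[c] else " " for c in sorted(row))
    print(student, marks, student_percent[student])
print(outcome_percent)
```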



TEA Format 

The TEA report format contains concise statements reflecting the degree of outcome mastery rather than the
mastery/nonmastery format used in the SCORE system report. An individual report is provided for each
member of the class which indicates his or her performance on the test. A copy of a TEA style report (for test
booklet 52) is given in Table 3.

The teachers were asked to evaluate the two types of systems. The SCORE format was favored, although the
response to the questionnaire was spotty due to the fact that it was sent out rather late in the school year.

SCORE REPORT - TABLE 1

TEACHER CLASS LIST

PROGRESS CITY ELEMENTARY          MR. DALE SMITH          MATHEMATICS
TEACHER CLASS SUMMARY                                     GRADE 06

OUTCOME   07-02  07-04  07-05  07-07  07-08  07-09  07-11  07-13  07-16  07-20

                 STUDENT SUMMARY
                 OUTCOME PERCENT
ABLE RON         50
ADAMS SUE        70
BAKER DON        80
BOONE JOE        90
CRAIG DEB        90
PARSON PAM       80
WEST ANN         90
WILLIAMS TED     90

PERCENT OF STUDENTS MASTERING OUTCOMES


A  The class list is a performance record for each student in a teacher's class for each outcome tested.
B  Numeric representation of the outcomes as listed in the teacher's outcome legend.
C  Percent of outcomes mastered by each student.
D  Interpretation of the outcome mastery is as follows: if a minus appears under an outcome, the
   student has not mastered that outcome. A blank designates mastery of the outcome.
E  The percent of the class mastering each outcome is also summarized.



Modified SCORE Report
Table 1.

OUTCOME LEGEND - TEST 11 - TABLE 2

01-03:

The student should understand the necessity for having a satisfying job when setting his career goal.

01-04:

The student should understand that he will work better when he accurately matches his personal goals
with his career choice.

01-05:

The student should be able to identify career directions which are available to him.

01-07:

The student should be able to use his/her own resourcefulness to solve personal problems such as: He
wants to go to college, but there is not enough money for tuition. He could look for a job, put in a request for
financial aid, or apply for a loan.






TEA REPORT - Table 3 

CHAPTER VII
STATISTICAL PROCEDURES FOR
DEVELOPMENT OF THE SURVEY INSTRUMENT

A survey or diagnostic instrument comprising about 45 items was developed to be used at the eighth-grade
level. The purpose of this test is to diagnose the need for further measurement of student performance with one
or more of the sixteen category tests. The category tests would then prescribe instructional strategies.

A stepwise regression procedure (cf. Draper and Smith, 1966) was employed to select one or (at most) two
items which correlate highly with the outcome scores. The dependent variable in this framework is the
outcome score, and the independent variables comprise (1) the "scores" on each item within the outcome (0 =
wrong, 1 = correct) and (2) a control variable to indicate, and thus control for, the grade tested (upper or
lower). The data may be fitted to the following regression equation:

\[ Y = B_0 + B_1 X_1 + B_2 X_2 + \cdots + B_{p+1} X_{p+1} + e, \]

where $B_0$ is the Y-intercept, $B_1$ is the regression coefficient for the control variable, $B_{i+1}$ is the
coefficient corresponding to the $i$th item "score" ($i = 1, 2, \ldots, p$), and

\[ X_1 = \begin{cases} 0 & \text{if the student is in the lower grade} \\ 1 & \text{if the student is in the upper grade} \end{cases} \]

\[ X_{i+1} = \begin{cases} 0 & \text{if the student answers item } i \text{ incorrectly} \\ 1 & \text{if the student answers item } i \text{ correctly} \end{cases} \]

The variable $X_1$ was always included. The other variables (no more than 2) were selected in a stepwise
manner as follows:

1. The variable with highest partial correlation with Y (holding $X_1$ fixed) is selected.

2. The variable with highest partial correlation, holding the item selected in step 1 fixed, is selected.

Tests of statistical significance for each item entered were conducted. They were all highly significant due to
the large number of subjects.
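
The two selection steps above can be sketched in plain Python. This is an illustration under assumptions, not the project's actual program: partial correlations are obtained by correlating least-squares residuals, the grade variable stays in the fixed set at both steps, the outcome score in the example is simply the sum of the item scores, and every name and data value is hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def residuals(target, predictors):
    """Residuals of target after a least-squares fit on an intercept plus predictors."""
    cols = [[1.0] * len(target)] + predictors
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * t for u, t in zip(ci, target)) for ci in cols]
    beta = solve(xtx, xty)
    return [t - sum(b * c[k] for b, c in zip(beta, cols)) for k, t in enumerate(target)]

def partial_corr(y, x, fixed):
    """Correlation of y and x after removing the linear effect of the fixed variables."""
    ry, rx = residuals(y, fixed), residuals(x, fixed)
    den = math.sqrt(sum(a * a for a in ry) * sum(b * b for b in rx))
    return sum(a * b for a, b in zip(ry, rx)) / den if den > 1e-12 else 0.0

def select_items(y, grade, items, max_items=2):
    """Forward selection of at most max_items items, with grade always held fixed."""
    fixed, chosen = [grade], []
    while len(chosen) < min(max_items, len(items)):
        remaining = [j for j in range(len(items)) if j not in chosen]
        best = max(remaining, key=lambda j: partial_corr(y, items[j], fixed))
        chosen.append(best)
        fixed.append(items[best])
    return chosen

# Hypothetical data: eight students, grade indicator, three 0/1 item scores.
grade = [0, 0, 0, 0, 1, 1, 1, 1]
items = [[1, 0, 1, 0, 1, 1, 0, 1],
         [0, 0, 1, 0, 1, 1, 1, 1],
         [1, 1, 0, 0, 1, 0, 1, 1]]
y = [sum(scores) for scores in zip(*items)]   # outcome score per student
print(select_items(y, grade, items))
```

The sketch would fail if a selected item were an exact linear combination of the variables already held fixed; a fuller implementation would check for that before solving.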

The decision was made to use two other criteria: (1) the addition to $R^2$, the squared multiple correlation
coefficient, and (2) the "beta weight" times the corresponding zero-order correlation or point biserial
(cf. Draper and Smith, 1966). These procedures yielded more or less the same results.

The outcomes were grouped (using subjective judgment) into subcategories or "clusters". If performance on
outcomes can be predicted with a reasonably high $R^2$ (say .3 and above), then one would expect that summing
over outcomes within a "part" would give even better predictability on the sub-categories. Due to the practical
constraints regarding test length, a few sub-categories are estimated by only one item.

CHAPTER VIII
IMPLICATIONS

Introduction

From the beginning, the Texas Career Education Measurement Series project has been visualized and
conducted as a developmental effort. The building of objective-based measures for this project has included many
types of procedures that either have been developed by others in the recent past or have been designed for
this project. Some of the steps taken have followed the precedents for test development while other
procedures have not followed the traditional mode. The purpose of this chapter is to assist those who are
either contemplating or conducting test development efforts similar to this one by discussing some of the
implications for test development.



Implementation of the Study 

Basing a measurement system on learner outcomes that have been developed from the perceptions of
students, educators, and those outside of the field of education brings credibility to the development of an
objective-based test. There is evidence of less difficulty in obtaining assistance from schools. Early planning with
schools is still necessary to assure timely field tests and item tryouts.

Development of test instruments should not be undertaken in an objective-based system until the objectives
are organized and written in appropriate form. Specification of the behavior domains to be measured is a
prerequisite to selection/development of items to measure those behaviors.

Planning of the system for reporting results from the measurement instruments should begin with the initial
development procedures. The reporting of results should become an important guide to the types of items
developed. If the "how to report" frame of reference is ignored, one result can be that after items are written, it
is discovered that the results cannot be reported in a useful manner.



Item Development 

The development of items for an area such as career education, which does not have an organized group of
professionals who represent that discipline, requires special attention in the item development phase. For
example,

• Item writing is particularly difficult, even for professional item writers.

• If local school personnel are to be involved in item development, sufficient preparation for the task must
be provided.

Contributions from local school personnel can be obtained more effectively if item writing is conducted away
from their regular duties. Time should be set aside for them to work without conflict with their daily routine.

In writing items for objective-based instruments, there should be a large number of items written in order to
have sufficient coverage of objectives in the final instruments. Although item attrition for objective-based
measures may occur for different reasons than for norm-referenced tests, one should expect to reject 30% to
50% of the items during the development and review processes.

Sensitivity-to-instruction is an important concept for objective-based measures. A study of this type should be
conducted after items have been validated for a given set of objectives in order to avoid interaction of two
dependent variables (quality of instruction versus item validity).



Item Review and Revision

Student review of items is very productive. If the students perceive that their input is important and will be
used, they will furnish useful information about items. Items should be discussed with a small group of
students (3-5) of the appropriate age. A student sampling plan should be devised to ensure that each item will
be reviewed by students of each ethnic, sex, geographical, etc. sub-population.

Continued revision of bad items soon becomes inefficient. If an item is unacceptable after two revisions, that
item should be discarded and a new one developed for the objective.




Analysis of Data 

Significant advances have been made in the kinds of statistical analyses that are available for item and test
construction in an objective-based measurement system. Further testing of these procedures will provide
evidence of their usefulness for other test developers. The procedures presented in Chapter IV are primarily
useful for items of a multiple choice format and do provide additional information for decision-making about
items. However, as the amount and types of information about items increase, additional attention must be
given to the decision model for item acceptance. The relative weight to be given the results from two or
more statistical procedures requires additional investigation.


General Implications

There is evidence from this project that a state department of education, a regionally-based project, and a
commercial contractor can function together to develop new measurement instruments. Although special
attention must be given to communication among the three organizations, serendipities of the following type
can result:

• A cadre of people at the regional level and the state level can obtain experience in test development. 
This expertise is beneficial for future development and revision of objective-based measures. 

• A commercial contractor can gain in knowledge of local, regional, and state educational policies and
procedures. In addition, a large number of students and teachers can be involved in item tryout and
revision procedures at a reduced cost to the contractor.

• An increased level of awareness is developed throughout the schools and regions from participating in 
the development process. 

• Positive results are obtained from the involvement of personnel from several areas of specialization — 
special education, vocational education, curriculum, guidance, measurement, etc. 

When procedures are designed for local school participation in a developmental project, management plans
must take into consideration the local school calendar in order to provide sufficient time to schedule project
activities around school holidays.

The procedures used to develop this measurement system imply that the career education tests are now in
their "first version". Objective-based measurement must be in a continuous state of refinement to retain
relevancy to priority objectives. Future administrations of the 16 category tests and the survey test will provide
additional student data upon which the system will be tested and refined.

APPENDIX A

Student Review of
Career Education Items

Item # ______   Campus ______   District ______   Recorder ______

M ____   F ____   B ____   A ____

Sections I and III are to be completed for all items in the package. In Section II, complete only that portion
appropriate to the format (multiple-choice, open-ended, etc.) of each particular item of the package.



SECTION I 



1. Relationship to objective: *Does the item get at the objective? Could the relationship be made
more direct? If yes, how?



2. Credibility: Is the response to the item likely to reflect what the student considers to be the truth, or
does the item lead the student toward giving an "expected" or "socially acceptable" response?




3. Bias/Offensiveness: *Is there anything offensive about the item?

If yes, what?



*Might the item be unfair to students of a particular race or sex?

How?



4. Understandability: *Was there any trouble understanding the item or the directions?
Yes ____ No ____ If yes, what caused the trouble?



5. Appropriateness: Underline the phrase which best describes how the students felt about the content of
the item: too Mickey Mouse, too advanced, unrelated to student interests, dated, interesting and
appropriate, other (specify).



Division of Program Planning and Needs Assessment 
Texas Education Agency

SECTION II

(complete only the entry compatible with the item being reviewed)

1. Multiple-choice item with one or more choices designated as "correct":

Do you agree that the "correct" response(s) is (are) indeed correct? If no, why?



Are some of the other responses defensible as being correct? If yes, which ones?



Do you think any smart student could, regardless of whether he had mastered the objective, be able to
eliminate some of the response choices? If yes, which ones?



2. Multiple-choice item with no response designated as "correct":

Are there enough response choices that each student could express his feeling? If no, which
choices should be added?



3. Matching item:

Do you feel that some of the matching pairs would fail to give any information as to whether the student
had mastered the objective? If so, which pairs?



4. Checklist: 



Would the person who is supposed to complete the checklist be able to do so without an excessive amount
of effort?

5. Open-ended item:

Is the scoring guide clear? Are the responses to the item likely to provide the information
sought?



6. Individually administered items:

Do you see any way that 2 or 3 group administered items (such as multiple-choice or matching) could get
the same information?

SECTION III



Comments concerning item attributes not mentioned above: 



General evaluation of item potential - Excellent   Good   Fair   Poor

Comments:



Suggestions for revisions: (where possible, enter onto item) 



Does this item need additional review?



Why? 

APPENDIX B

Teacher/Counselor Review
of Career Education Items

Recorder ______   Region ______

Reviewers: ______   Specialty: ______

Item # ______   Date: ______



Sections I and III are to be completed for all items in the package. In Section II, complete only that portion
appropriate to the format (multiple-choice, open-ended, etc.) of each particular item of the package.

SECTION I

1. Relationship to objective: *Is the relationship between the objective and what is measured by the item
acceptably close? YES ____ NO ____ How could the item be changed so as to bring about a
closer relationship between the objective and the item?



2. Credibility: Is the item likely to obtain a true picture of the student's knowledge, feelings, or plans (as
distinguished from an "expected" or "socially acceptable" response)? If no, why?



3. Bias/Offensiveness: *Is the item biased against or likely to be offensive to students of a particular race,
sex, geographic location, size and/or type of community, or socio-economic status?

YES ____ NO ____ If yes, indicate the nature of the difficulty and, if possible, how the bias or
offensiveness might be reduced.

4. Understandability: Which words, if any, would be likely to cause difficulty among students at the sixth
grade reading level?

Is the sentence structure easy to follow?



*Would the item and its directions be understandable by 90% of 8th grade Texas students?
YES ____ NO ____ How could the item or its directions be improved?



5. Appropriateness: Is the item appropriate for grade level 8? ____ 11? ____ If no, why?



6. Usefulness: Does the item provide information useful in identifying the student's instructional
needs? If no, could the item be changed to do so? How?

SECTION II

(identify and complete only the entry appropriate for the item being reviewed)

1. Multiple-choice item with one or more choices designated as "correct":

Is there any quarrel that the response choice(s) designated as "correct" are more correct or desirable
than the response choices not designated as "correct"? If yes, explain.



Are any of the response choices so weak that a student who lacks the knowledge (or the desired attitude),
but is "test-wise" enough to use the process of elimination, can guess the correct response at an above
chance level? If yes, how could the "weak" response choices be strengthened?



2. Multiple-choice item with no response designated as "correct": 

Do the response choices provide wide enough coverage to enable the student to give a reasonably
accurate expression of his attitude or plan? If no, what should be added or changed?



3. Matching item:

Are any of the matching pairs "weak", i.e., fail to provide information as to the student's mastery of the
objective? If so, which pairs?



4. Checklist:

*Would teachers (or students, as appropriate) find the checklist feasible of

completion? YES ____ NO ____

scoring? YES ____ NO ____



5. Open-ended item:

Is the scoring guide clear?

*Would scoring of the item by teachers be feasible? YES ____ NO ____ If no, why not?

Will responses to the item provide the information sought? If no, why not?

Could another type of item be used to gain similar information? If so, how?



6. Individually administered items:

*Would scoring of the item by teachers be feasible? YES ____ NO ____ Could 2 or 3 group
administered items (such as multiple choice or matching) get the same information? If yes, how?

SECTION III

Comments concerning item attributes not mentioned above: 



General evaluation of item potential - Excellent   Good   Fair   Poor

Comments:



Suggestions for revisions: (where possible, enter onto item)



Does this item need additional review? 



Why? 

APPENDIX C
School Districts Which Participated



School Districts




Abilene ISD
Alamo Heights ISD
Aldine ISD
Aledo ISD
Alief ISD
Amarillo ISD
Anthony ISD
Apple Springs ISD
Arlington ISD
Austin ISD



X 
X 
X 
X 



X 
X 



Beaumont ISD
Boerne County Line ISD
Brazosport ISD
Breckenridge ISD
Bryan ISD
Burleson ISD

Calhoun County ISD
Carroll ISD
Carrollton-Farmers Branch ISD
Carrizo Springs ISD
Castleberry ISD
Chapel Hill ISD



V X 
Cleburne ISD
Clyde ISD
Collinsville ISD
Corpus Christi ISD
Cotulla ISD
Crockett County Cons. ISD
Crystal City ISD
Cypress-Fairbanks ISD

Dallas ISD
Dayton ISD
Denison ISD
Denton ISD
Dimmit ISD
Donna ISD
Duncanville ISD

Eagle Pass ISD
Eanes ISD
Ector ISD
Ector County ISD
Edgewood ISD
Edinburg ISD
Edna ISD



X 
X 



X 
X 
X 

El Paso ISD    X
Everman ISD    X
Flatonia ISD    X
Forney ISD    X
Fort Stockton ISD    X
Fort Worth ISD    X X X X
Fredericksburg ISD    X
Galena Park ISD    X
Galveston ISD    X
Garland ISD    X X
Gatesville ISD    X
Giddings ISD    X
Goose Creek ISD    X X
Granbury ISD    X
Greenville ISD    X
Gregory-Portland ISD    X
Groesbeck ISD    X
Hamshire-Fannett ISD    X
Harlandale ISD    X
Hearne ISD    X
Hidalgo ISD    X

Highland ISD
Houston ISD
Hughes Springs ISD
Hurst-Euless-Bedford ISD

Irving ISD

Joshua ISD

Kerrville ISD
Kendale ISD
Kilgore ISD
Killeen ISD
Kingsville ISD
Klein ISD

Lackland ISD
Lake Dallas ISD
Lampasas ISD
La Porte ISD
Lewisville ISD
Little Cypress-Mauriceville ISD
Lockney ISD
Lorenzo ISD

[district name illegible]    X
McAllen ISD    X X X
Midland ISD    X
Mineral Wells ISD    X
Mission ISD    X
Moody ISD    X
Nederland ISD    X
New Boston ISD    X
North East ISD    X X
North Forest ISD    X X
Northside ISD    X
Palmer ISD    X X
Pearsall ISD    X
Pflugerville ISD    X X
Pharr-San Juan-Alamo ISD    X
Plano ISD    X X
Post ISD    X
Pottsboro ISD    X
Quinlan ISD    X X

Red Oak ISD    X X
Richardson ISD    X
Rio Hondo ISD    X
Robinson ISD    X
San Antonio ISD    X X
Santa Rosa ISD    X
Sherman ISD    X
South San Antonio ISD    X
Spring Branch ISD    X X
Stamford ISD    X
Taylor ISD    X
Temple ISD    X
Terrell ISD    X
Texas City ISD    X
Tyler ISD    X
United ISD    X
Van Alstyne ISD    X
Victoria ISD    X
Waco ISD    X

Water Valley ISD    X
Weatherford ISD    X X X
Wharton ISD    X
Whitewright ISD    X
Wichita Falls ISD    X
Willis ISD    X
Wilmer-Hutchins ISD    X X X
Wylie ISD    X X
Ysleta ISD    X





APPENDIX D 

THE WLC/MRC INSTRUMENT ANALYSIS PROGRAM PACKAGE: 
INTERPRETIVE GUIDE 

The WLC/MRC instrument analysis package is a "generalized" computer program for analyzing items and
instruments. It goes beyond the standard item analysis, applying statistical tests of significance to determine
whether or not (i) students are, as a group, guessing at the item, (ii) the foils are attracting uniformly, and (iii)
an item or instrument is culturally biased. In addition, traditional statistics are computed such as "p-values,"
foil distributions, point biserials, and KR20 reliability coefficients. The package comprises two
components: (1) an "item analysis," which includes the cultural validity analysis, and (2) an "objective analysis,"
which includes mastery/non-mastery statistics as well as KR20's.



Description of the "Item Analysis" Printout

The printout for the "item analysis" includes the following statistics:

1. P-values

The percent correctly answering each item is presented.

2. Foil distribution

The percent answering each wrong response as well as omits, double marks, and "invalids" is
presented.

3. Z-test for "chance level of functioning (guessing)"

The hypothesis H0: p = 1/r is tested, where r = number of responses, against the one-sided
alternatives H1: p < 1/r and H2: p > 1/r, respectively, using a large sample (approximate) test. The
hypotheses H0, H1, and H2 correspond to "guessing," "below chance," and "above chance,"
respectively. If instruction has been given to the objectives tested, acceptance of H2 means that there is
evidence that the item is appropriate for the grade level tested. (If instruction has not been given, this
test is still informative, but acceptance of H2 should not be considered a requirement for inclusion of
the item in the instrument.)

4. Chi-square test for uniform foil response distribution

A chi-square test of the hypothesis that the foils (incorrect responses) are uniformly attractive is
conducted and relevant statistics are printed.

5. Internal consistency

A point biserial yields information about the internal consistency of the test, i.e., the extent to which
"the items measure the same thing." Moreover, the "maximum" point biserial (corresponding to the
case p = 1/2) is calculated. This statistic indicates the extent of the influence of the p-value on the
point biserial.

6. Breakdown by "cultural" groups

The above statistics are computed for each cultural group, for each cultural variable (e.g., ethnic
background, sex, SES, etc.).

7. Cultural validity analysis

Statistics for testing and measuring the cultural validity of items, objectives, and the total test are
computed. The approach is that described in the SCORE technical report "Cultural Validity of Items
and Tests: A New Approach" by James R. Veale and Dale I. Foreman. The conditional "foil" response
distributions are investigated using chi-square and other procedures for measuring the degree of
heterogeneity of these distributions across cultural groups.
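
The point biserial and "maximum" point biserial of item 5 can be illustrated with a short sketch. It is not the WLC/MRC program. It assumes items are scored 0/1, uses the textbook point biserial formula, and interprets the "maximum" value as the same standardized difference evaluated at p = 1/2; that interpretation and the function name are assumptions.

```python
import math
from statistics import mean, pstdev


def point_biserial(item_scores, total_scores):
    """Return (r_pb, max_r_pb) for one item scored 0/1.

    r_pb     = (M_right - M_wrong) / s_total * sqrt(p * q)
    max_r_pb = the same standardized gap with sqrt(p * q) replaced by 1/2,
               its largest possible value (assumed reading of the guide).
    """
    if len(item_scores) != len(total_scores):
        raise ValueError("item and total score lists must have equal length")
    right = [t for i, t in zip(item_scores, total_scores) if i == 1]
    wrong = [t for i, t in zip(item_scores, total_scores) if i == 0]
    spread = pstdev(total_scores)  # population standard deviation of totals
    if not right or not wrong or spread == 0:
        raise ValueError("need both right and wrong answers and varying totals")
    p = len(right) / len(item_scores)  # the item's p-value as a proportion
    gap = (mean(right) - mean(wrong)) / spread
    return gap * math.sqrt(p * (1 - p)), gap * 0.5


if __name__ == "__main__":
    r_pb, r_max = point_biserial([1, 0, 0, 0], [4, 1, 2, 3])
    print(f"point biserial {r_pb:.3f}, maximum point biserial {r_max:.3f}")
```

The gap between the two values shows how much an extreme p-value alone is holding the point biserial down.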

In all of the above procedures which involve significance tests, significance levels (i.e., the probability of
"more extreme" values under the null hypothesis) are computed if they are less than 0.10. This enables the
user to specify his own "critical" level of significance (e.g., .01, .05, or .10). A sample item analysis is given in
Table 1 on page 4.
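
The two significance tests of items 3 and 4 can be sketched as follows. This is an illustration under stated assumptions, not the package's code: the z-test uses the usual large-sample normal approximation for a proportion, the chi-square test uses equal expected counts across foils, and all function names are hypothetical. The 0.10 reporting threshold follows the paragraph above.

```python
import math
from statistics import NormalDist


def chance_level_z(n_correct, n_students, n_choices):
    """Large-sample z-test of H0: p = 1/r against both one-sided alternatives.

    Returns (z, level_below_chance, level_above_chance).
    """
    p0 = 1 / n_choices
    p_hat = n_correct / n_students
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n_students)
    below = NormalDist().cdf(z)  # significance level for H1: p < 1/r
    return z, below, 1 - below   # last value is the level for H2: p > 1/r


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate (closed-form recurrence)."""
    if df % 2:
        q, k = math.erfc(math.sqrt(x / 2)), 1
    else:
        q, k = math.exp(-x / 2), 2
    while k < df:
        q += (x / 2) ** (k / 2) * math.exp(-x / 2) / math.gamma(k / 2 + 1)
        k += 2
    return q


def uniform_foil_test(foil_counts):
    """Chi-square test that all foils are equally attractive.

    Returns (chi_square, degrees_of_freedom, significance_level).
    """
    expected = sum(foil_counts) / len(foil_counts)
    chi2 = sum((obs - expected) ** 2 / expected for obs in foil_counts)
    df = len(foil_counts) - 1
    return chi2, df, chi2_sf(chi2, df)


def printable_level(level, threshold=0.10):
    """Mimic the reporting rule: keep a level only if it is below 0.10."""
    return level if level < threshold else None


if __name__ == "__main__":
    z, below, above = chance_level_z(n_correct=40, n_students=100, n_choices=4)
    print(f"z = {z:.2f}, above-chance level = {printable_level(above)}")
    print(uniform_foil_test([30, 20, 10]))
```

Leaving the critical level to the caller, as the guide does, keeps the choice among .01, .05, and .10 with the user.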

Decision Model for "Item Analysis"

A "global" analysis of the printout data is suggested for determining the viability of the items. A "decision
model" (Table 2) is presented to indicate one possible set of criteria. (Notation: χ² = chi-square
statistic, V = Cramér's statistic which measures degree of association or heterogeneity, T95 = lower 95
percent confidence limit for the Goodman-Kruskal T statistic, L*95 = lower 95 percent confidence limit for
Goodman-Kruskal L* statistic, "PT-BISER" = point biserial correlation coefficient, "Max PT-BISER" = "maximum"
point biserial, Z = statistic for testing chance level of functioning (guessing).)

The specific numerical cutoffs for the "rejection," "questionable," and "acceptance" levels are only rough
guidelines for analysis. We do not favor a "weighting" system for evaluating items (e.g., assigning weights to
the four types of analyses and numerical ratings to the three levels), since this would imply a further
"abstraction" of the observed data beyond the statistical analysis. Moreover, it involves a high degree of
arbitrariness. However, such a system may be of use in special situations.

Objective Analysis

The printout for the "objective analysis" includes the following:

1. Percent mastering objectives

The percent of respondents "mastering" each objective is printed. This is computed by determining
the number of respondents who correctly answered a sufficiently high number of items in each
objective. (For example, if there are five items and a 70 percent mastery level is used, a student must
answer at least four items correctly to be classified as a "master.")

2. Upper confidence limits for percent mastering

Upper 95 and 99 percent confidence limits are computed using standard statistical procedures. This
yields the largest probable values of the percent mastering objectives for the population based on the
sample data.

3. KR20 reliability coefficients

A KR20 reliability coefficient is computed for each instrument.
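
The mastery classification and its upper confidence limit can be sketched as below. The guide says only that "standard statistical procedures" were used, so the one-sided normal-approximation limit here is an assumption, as are the function names and the sample scores.

```python
import math
from statistics import NormalDist


def min_items_for_mastery(n_items, mastery_percent):
    """Smallest number of correct items meeting the mastery level.

    Integer ceiling of n_items * mastery_percent / 100, e.g. 5 items at
    70 percent gives 4. Both arguments must be integers.
    """
    return -(-n_items * mastery_percent // 100)


def percent_mastering(correct_counts, n_items, mastery_percent=70):
    """Percent of students whose correct count reaches the mastery cutoff."""
    cutoff = min_items_for_mastery(n_items, mastery_percent)
    masters = sum(1 for count in correct_counts if count >= cutoff)
    return 100 * masters / len(correct_counts)


def upper_confidence_limit(percent, n_students, confidence=0.95):
    """One-sided upper limit for a percentage (normal approximation, assumed)."""
    p = percent / 100
    z = NormalDist().inv_cdf(confidence)
    return 100 * min(1.0, p + z * math.sqrt(p * (1 - p) / n_students))


if __name__ == "__main__":
    scores = [5, 4, 3, 2, 4, 1, 0, 5, 3, 4]  # hypothetical correct counts
    pct = percent_mastering(scores, n_items=5)
    print(pct, upper_confidence_limit(pct, len(scores)))
```

Working in integer percentages avoids floating-point surprises at the cutoff, such as 0.7 * 10 rounding up to 8 items.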

Cultural validity analysis may be conducted at the (i) item, (ii) objective, and (iii) total test level. Similarly,
point biserial, percent mastering, and KR20 statistics may be computed with respect to objectives (two
hierarchical levels) and total test.
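
The KR20 coefficient named above is the Kuder-Richardson formula 20. A minimal sketch follows, assuming 0/1 item scores and the population variance of total scores; the function name and sample data are illustrative.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for a students-by-items matrix of 0/1 scores.

    KR20 = k / (k - 1) * (1 - sum(p_i * q_i) / variance of total scores)
    """
    n_students = len(responses)
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("KR20 needs at least two items")
    total_variance = pvariance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; KR20 is undefined")
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_students
        sum_pq += p * (1 - p)
    return n_items / (n_items - 1) * (1 - sum_pq / total_variance)


if __name__ == "__main__":
    print(kr20([[1, 1], [1, 0], [0, 0], [0, 0]]))
```

The same function can be applied to the items of one objective or to a whole instrument, matching the levels mentioned above.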

APPENDIX E

STUDENT INFORMATION SHEET

On your Answer Sheet there is a section labeled "STUDENT INFORMATION," columns 3 through 9. Each
column of numbered ovals corresponds to a question on this page. Read each question, 3 through 9, and darken
the oval that matches the number of your response in the appropriate column on your Answer Sheet.

3. To which group do you belong?

   1. Mexican-American
   2. Black
   3. Anglo
   4. American Indian
   5. Oriental
   6. Other

4. Which language is spoken in your home?

   1. Spanish
   2. German
   3. Czech
   4. French
   5. Chinese
   6. Italian
   7. Polish
   8. English
   9. Other

5. Outside of school, how long do you usually watch TV on a school day?

   1. None
   2. 1 or 2 hours
   3. 3 or 4 hours
   4. 5 or 6 hours
   5. More than 6 hours

6. How many books do you have in your home?

   1. Few
   2. Many

7. Do you have encyclopedias in your home?

   1. Yes
   2. No

8. Does your family receive a daily newspaper?

   1. Yes
   2. No

9. Does your family receive magazines through the mail?

   1. Yes
   2. No

CULTURAL VALIDITY OF ITEMS AND TESTS: A NEW APPROACH

James R. Veale and Dale I. Foreman
Technical Report No. 1

Abstract*

The question of cultural bias in test instruments is a critical one for test development. Most of the procedures
for detecting cultural bias which have been heretofore advanced assume that either (i) an unbiased external
criterion for ability is available, or (ii) the total score on the test is a reasonably good approximation of the
student's ability.

The approach taken in this paper is based on the variation among conditional foil response distributions for the
various cultural groups in the population tested. It does not involve measures of ability and thus does not
require either of the above assumptions. Both large sample and small sample procedures are presented.

*Reprints are available from WLC/MRC, Iowa City, Iowa

APPENDIX G

GUIDE TO THE STATISTICS USED IN THE CULTURAL VALIDITY ANALYSIS

This appendix includes a brief discussion of the statistics used in the cultural validity analysis:

1. Chi-square statistic.

A chi-square statistic is computed for each item to test the statistical significance of cultural
heterogeneity of foil responses, i.e., to test the hypothesis that cultural groups and foil response are
independent. The usual formula was applied to the contingency table consisting of foil responses (columns)
for the various cultural groups (rows). Significance levels were computed and (when they were less than
0.10) printed.

2. Cramér's V statistic.

Cramér's V is a measure of the degree of cultural variation in foil responses, defined as follows:

\[ V = \sqrt{\frac{\chi^2}{N \, \min(g - 1,\; f - 1)}} \]

where χ² is the aforementioned chi-square statistic, N is the number of incorrect (foil) responses, g is the
number of cultural groups, and f is the number of foils (plus "double marks," if any). The V statistic ranges
from zero to unity, with zero corresponding to no cultural variation and unity corresponding to extreme
cultural variation.

3. The Goodman-Kruskal measures of heterogeneity.

Goodman and Kruskal (1954) developed several measures of association which have a probabilistic
interpretation. Two of these statistics, denoted T and L, are defined as follows:

\[ T = \frac{\sum_a \sum_b O_{ab}^2 / O_{a\cdot} \; - \; \sum_b O_{\cdot b}^2 / N}{N - \sum_b O_{\cdot b}^2 / N},
\qquad
L = \frac{\sum_a O_{am} - O_{\cdot m}}{N - O_{\cdot m}} \]

where O_{ab} is the observed number of responses to foil b in cultural group a, O_{a.} is the total number of foil
responses in cultural group a, O_{.b} is the total number of responses to foil b, O_{am} is the maximum number
of foil responses in cultural group a, O_{.m} is the maximum total number of foil responses (after summing
over cultural groups), and N is the total number of foil responses.

The above statistics, and the slightly modified statistics denoted T* and L*, have operational meaning
whatever the sample size (N), unlike the chi-square (which requires large samples). They measure the
proportion of errors in predicting the foil responses of randomly chosen individuals that can be eliminated
by incorporating knowledge of the individual's cultural group. They all range from zero to unity, with zero
corresponding to no gain in predictive utility with knowledge of cultural groups (no cultural variation) and
unity corresponding to perfect predictive utility with knowledge of cultural group (extreme cultural
variation).

4. Lower 95 percent confidence limits for T and L*.

Lower 95 percent confidence limits for the true values of T and L* were also computed. This takes into
account the sampling error, which is important since we are sampling approximately 500 students (per
instrument), rather than testing the entire population of Texas students.



5. Degree of cultural variation.

Professional judgment was employed to rate the degree of cultural variability exhibited by the item data,
using all of the statistics discussed above. The rating scale was:

1 = very high variability,

2 = high variability, and

3 = moderate variability.
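
The chi-square and Cramér's V statistics described above can be computed from a table of foil counts with cultural groups as rows and foils as columns. The sketch below uses the standard textbook definition of V (square root of chi-square over N times the smaller of g - 1 and f - 1), which is assumed to match the report's usage; the function names and sample table are illustrative.

```python
import math


def chi_square(table):
    """Chi-square statistic of independence for a groups-by-foils count table.

    Returns (chi_square, N), where N is the total number of foil responses.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every group and every foil needs at least one response")
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def cramers_v(table):
    """Cramér's V: 0 means no cultural variation, 1 means extreme variation."""
    chi2, n = chi_square(table)
    g, f = len(table), len(table[0])  # number of groups, number of foils
    if min(g, f) < 2:
        raise ValueError("need at least two groups and two foils")
    return math.sqrt(chi2 / (n * min(g - 1, f - 1)))


if __name__ == "__main__":
    # Hypothetical foil counts: two groups (rows) by two foils (columns).
    print(cramers_v([[8, 2], [4, 6]]))
```

Because V rescales chi-square by the sample size, it can be compared across items with different numbers of foil responses.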

For a more detailed discussion of the approach and techniques used for measuring cultural variation, see
Veale and Foreman (1975).
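
The Goodman-Kruskal measures can be sketched in the same way. The code below implements the standard tau and lambda for predicting the foil (column) from the cultural group (row), which is assumed to correspond to the T and L of this appendix. The modified statistics T* and L* and the confidence limits are not defined here and are not attempted; the function names are illustrative.

```python
def goodman_kruskal_tau(table):
    """Proportional reduction in error when foil choices are predicted by
    proportional allocation, with versus without knowledge of the group."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    within = sum(obs * obs / row_totals[i]
                 for i, row in enumerate(table) if row_totals[i]
                 for obs in row)
    marginal = sum(c * c for c in col_totals) / n
    if n == marginal:
        raise ValueError("all responses fall on one foil; T is undefined")
    return (within - marginal) / (n - marginal)


def goodman_kruskal_lambda(table):
    """Proportional reduction in error when the modal foil is predicted,
    with versus without knowledge of the group."""
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(col_totals)
    modal_overall = max(col_totals)                 # O.m in the text
    modal_by_group = sum(max(row) for row in table)  # sum over a of O_am
    if n == modal_overall:
        raise ValueError("all responses fall on one foil; L is undefined")
    return (modal_by_group - modal_overall) / (n - modal_overall)


if __name__ == "__main__":
    counts = [[8, 2], [4, 6]]  # hypothetical groups (rows) by foils (columns)
    print(goodman_kruskal_tau(counts), goodman_kruskal_lambda(counts))
```

Both functions return zero when group membership gives no help in predicting the foil chosen and one when it predicts the foil perfectly.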

APPENDIX H

ANALYSIS FOR ITEMS EXHIBITING CULTURAL VARIATION

This appendix includes a content analysis of items manifesting some degree of cultural variation according to
the statistics described in Chapter 4 and in Appendix G. Tables listing the items having cultural variations and
probable cause(s) for the variation(s) are displayed. (For example, an item may be culturally biased as
reflected by the variation in foil responses across groups due to a factor inherent in the respondent's cultural
background which results in a distortion of the p-values for the groups.) It should be understood that these
analyses consist of data-based hypotheses of one test development specialist.

Outcome/                                                                  No Clear
Item Number          Bias(Type)   Diagnostic   Bad Foil(#)   Bad Format   Evidence

0103/3                                         X(A,D)
0103/3                                         X(A,D)
0104/11       07                  X
              10                  X
0104/12 *     07     X(E)
0104/13       10                                                          X (Easy)
0104/15       07                  X
0104/16       07     X(S,E)
0105/17A1     07                                             X
0105/17A1     10     X(E)                                    X
0105/17C1     07     X                                       X
0105/17E *    07                                             X
0105/17D      07     X(E)                                    X
0105/17B1     07     X(E)                                    X
0107/7 *      10     X(E)                      X(B)

Key for the above table and other tables in this appendix:
* = item is included in the content analysis
E = ethnic variable
S = sex variable



BAD FOIL

Booklet 11, Item 3:

Which ONE of the following is the BEST reason why people need to be satisfied with their jobs?

(A) If they make the effort, people can learn to get along on a job.

(B) Satisfied people do better work and are happier.

(C) Satisfied people do not have to try very hard to better themselves.

(D) People should seek job satisfaction from their family and friends.

Foils "(A)" and "(D)" do not relate to the question that was asked. Any answer to a question should certainly
answer the question (only wrongly if it is a foil). Both "(A)" and "(D)" need to be revised to answer the question
"Why do people need to be satisfied?" or replaced with other foils.

BAD FOIL + CULTURAL BIAS

Booklet 11, Item 7:

As a pharmacist working for a chain drugstore, your goal is to operate your own business. You realize
that a new pharmacy probably would be successful if opened in a recently developed area of town. You
would like to quit your job and establish your own business, but you do not have enough money to do so.

Which of the following actions would BEST solve your problem and help you reach your goal?

(A) forget about operating your own business

(B) sell your home and car to raise the money

(C) go into partnership with someone with money to invest

(D) read all the latest magazines on drugstore operation

Foil "(B)" is not attractive to any of the students. It is logical that nearly everyone is sufficiently security
oriented (conservative) to resist giving up anything that they already possess in order to engage in a speculative
venture. That is exactly what is suggested in foil "(B)," "sell your house and car to raise the money."

Another problem with the foil in relation to the item is that nowhere in the item does it say "you" own a house
and car. Most kids would not consider foil "(B)" since they cannot relate to such ownership.

Mexican-Americans are overly attracted to "(D)." It is possible that through their background (poor reading)
and their view of the background of those who are successful (and can read) they believe reading proficiency
will yield success.

CULTURAL BIAS

Booklet 11, Item 12:

Grace wants a job where she does not have to deal with many strangers.

Which career do you feel would BEST match Grace's goal?

(A) receptionist

(B) bookkeeper

(C) public librarian

(D) salesperson

In this item, there is a Mexican-American/black "interaction" at grade 7 with foils "(C)" and "(D)". (Mexican-
Americans were more attracted to "(C)", while blacks were more attracted to "(D)".) Unless the students were
specifically taught the duties of these jobs, it is likely that the responses would be highly influenced by either
lack of experience or by some key word association. For example, the most difficult word, "receptionist," is
chosen very frequently. This very often happens when the students have little knowledge of a concept. Moreover,
it is interesting to note that among the foil responses, "(C)" and "(D)" are proportionately more attractive with
minorities than with "others" (primarily Anglos). With specific education to these occupations, this variation
may be eliminated.

BAD FORMAT + DIAGNOSTIC

Booklet 11, Item 17:

At this time, which of the following do you think is your career direction? Darken (A) on your Answer
Sheet for the ONE direction which you have chosen. Darken (B) for the others.

(A) (B) a. enter a trade or technical school
(A) (B) b. prepare for immediate employment
(A) (B) c. enter college
(A) (B) d. do not work at a job
(A) (B) e. some other direction

This item has no correct answer. It is asking a student to select a career direction. The data can then be used
as census data to help plan for counseling, etc.

Unfortunately, the "Yes/No" format was confusing to the Mexican-Americans and blacks. The data show that
many minority students marked "Yes" to several of the career directions. They did not understand that only
one "Yes" should be marked. These data in their present form are of little use.

A better format would be to eliminate foil "(E)" and make this item a four choice multiple choice asking the
student to "Mark the ONE career direction you choose."


BOOKLET 12 



Outcome/
Item Number     Bias(Type)     Diagnostic     Bad Foil(#)     Bad Format     No Clear Evidence

0102/5 *
0102/24
0102/24
0112/8 *
0112/10
0112/10
0112/11
0112/11
0202/16
0202/38 *
0202/21
0202/22


_Da 
.be- 
ll 
11 

08 

11 

08 
11 
11 
08 
08 
08 


^ X(E) 

X(S) • . 

X(E) 
X(E) ■ 


X 



■ X 
X 

X 


X(All Correct)

X (A,B) 

X(B) 

X(B) 

X(C) 

X(C) 


X 


« 

t 

X 



BAD FORMAT

Booklet 12, Item 5:

Read the following paragraph and answer questions 5 and 6 on your Answer Sheet.

Carol, who is a volunteer worker at General Hospital, is graduating from high school. She hopes to make
nursing her career. The hospital has offered her a job as a nurse's aide. Carol is trying to decide
whether to take the job or to enter her local community college to become a Licensed Vocational Nurse.

5. If Carol decides to take the job, which ONE of the following might be a result of that decision?

(A) She might get to be a doctor.

(B) She may never become a nurse.

(C) She will always work as a nurse's aide.

(D) She would still be able to go to school.

The stem of this item is stated in such a way that all answers are correct. The question asks "which ONE of the
following might be a result"; any of the answers might be a result of the decision. It should be restated in such a
way that the student will select the most likely result of the decision, and then make sure there is only ONE
most likely result among the answers.

BAD FOIL

Booklet 12, Item 8:

Joe never did well in school. Five years ago, he dropped out and began doing odd jobs around the
neighborhood. He lived with his folks and paid part of the living expenses with his earnings.

A year ago, Joe and LaWanda married. Now they and their baby live with his folks, but they would like
very much to be able to move to a place of their own. Joe worries a lot about taking care of his family.
Joe keeps trying to get a steady job. He wants to get training. He needs a high school diploma. His
friends tell him that he is crazy to think that things will ever get better.

Given the factors influencing Joe's life-style, which ONE of the following statements BEST describes
Joe's chances of meeting his needs and wants?

(A) Because of Joe's educational level, he will not have difficulty meeting his needs and wants.
(B) Because of Joe's marriage, he will meet all his needs and wants.
(C) Because of Joe's educational level and family responsibilities, he will have a difficult time meeting
his needs and wants.

Foils "(A)" and "(B)" are too easily eliminated. One problem is that because two foils are parallel (negatives),
"(A)" and "(C)", the student can automatically eliminate "(B)". This is a common problem in test construction.

Secondly, it is obvious that Joe's low educational level is going to limit his success in meeting his needs and
wants. This leaves "(C)" as the only choice.



CULTURAL BIAS + BAD FORMAT

Booklet 12, Item 18:

Which ONE of the following would NOT be a good way to learn about the supply and demand of a
particular occupation?

(A) going to the local employment office

(B) talking to personnel directors

(C) talking to those currently employed in the field

(D) determining the number of workers in your local community

There is evidence that the blacks and Mexican-Americans are NOT reading the negative stem as a negative.
Both are going to foils, each different, that would be, in their minds, the BEST places to learn about supply and
demand of an occupation. The concept of supply and demand may be too difficult for eighth graders.



BOOKLET 21 



Outcome/Item Number   Grade   Bias(Type)   Diagnostic   Bad Foil(#)   Bad Format   No Clear Evidence

0201/18 *             10                   X
0205/13               10                   X            X(B)
0205/16               07                   X
0205/16 *             10                   X
0207/10               07                                X(D)
0207/17               07                   X
0210/5                07      X(E)
0210/6                07                   X
0210/7                07                   X

DIAGNOSTIC

Booklet 21, Item 16:

Which of the sources below would give you the BEST information (job description, location within the
United States, salary, requirements) about all types of employment?

(A) a local employment agency

(B) "Help Wanted" section of newspaper

(C) Occupational Outlook Handbook

(D) state employment office

The incorrect responses to this question should lead into instructional strategies which will clarify the typical
types of information that can be obtained from each source. One potential source of problems at present may
be that many students do not know that the Occupational Outlook Handbook exists. Also, most
people are aware of the "Help Wanted" section of the newspaper and state employment offices. This could
cause differential attraction to "(B)" and "(D)" due to their common occurrence.

DIAGNOSTIC

Booklet 21, Item 18:

Part I

On the line below, write the occupation title you chose from the Occupation List on Page 3.

Think about the occupation you chose. Have you ever talked to someone who works in that field in order to get
more information about the field? If so, darken (Yes), otherwise darken (No).

Part II

If you answered "Yes," what did he/she tell you about his/her job that might be useful to you?

Scoring Key:

Mark "(A)" if the student indicates his/her career of interest, "Yes" for Part I, and at least one piece of
useful information that the person told him/her about his/her job in Part II.
Mark "(B)" if the student indicates his/her career of interest and "Yes" for Part I only.
Mark "(C)" otherwise.

Examples of Useful Information:

- types of skills and knowledge areas required

- job outlook for the future

- types of job characteristics relevant to the job

- salary expectations

- types of employee benefits that probably exist

- chances for advancement in the chosen career

This is an open-ended item with a scoring guide. The differential response patterns for this type of item mean
either that the scoring guide is inappropriate, incomplete, or otherwise dysfunctional or that the information is
diagnostic of different population deficiencies. In this case, the scoring guide is appropriate. The strong
Mexican-American affinity to "(C)" implies that fewer of the group have talked to someone who works in a field
of their interest.
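The three-way scoring key for this open-ended item can be written as a small decision rule. This is only a sketch of the key as worded above; the function name and argument names are assumptions, and deciding whether a written statement counts as "useful information" is still left to the human rater.

```python
def score_item_18(occupation, talked_to_worker, useful_info):
    """Return "A", "B", or "C" following the written scoring key.

    occupation       -- text the student wrote as the chosen occupation
    talked_to_worker -- True if the student darkened "Yes" in Part I
    useful_info      -- list of Part II statements a rater accepted as useful
    """
    named_career = bool(occupation and occupation.strip())
    if named_career and talked_to_worker and len(useful_info) >= 1:
        return "A"
    if named_career and talked_to_worker:
        return "B"
    return "C"


# Made-up responses, for illustration only.
print(score_item_18("welder", True, ["salary expectations"]))
print(score_item_18("welder", True, []))
print(score_item_18("welder", False, []))
```

Because "(C)" collects every remaining case, a student who names no occupation and a student who has simply never talked to a worker receive the same mark; a teacher using the result diagnostically may want to look at the booklet itself to tell them apart.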



BOOKLET 32a 



Outcome/Item Number   Grade   Bias(Type)   Diagnostic   Bad Foil(#)   Bad Format   No Clear Evidence

0301/2 *              11                   X            X(D)
0301/3                08                                X(C,D)
0302/11 *             08      X(E)                      X(Stem,A)
0302/12B              08                   X
0302/12B              11                   X




DIAGNOSTIC AND BAD FOIL

Booklet 32a, Item 2:

Which ONE of the following would you probably be required to write in on a job application form?

(A) names and addresses of references

(B) names of stores where you have charge accounts

(C) names of your teachers

(D) names of foreign countries in which you have traveled

Only one kind of application would include the question "What foreign countries have you traveled in?" That is a
security clearance for a government job. The foil is very out of line with the other responses, making it
unattractive or unreasonable. Something like "names and addresses of all your schools" would be better.

The other foils give diagnostic information, such as an indication of where you would have to list your charge
accounts. Each of these wrong responses could be used by the teacher to teach the student where their use
would be appropriate.

BAD FOIL + CULTURAL BIAS

Booklet 32a, Item 11:

John is 16 years old and will be interviewed for a part-time position as a machinist. The personal
quality his prospective employer will think MOST important is

(A) his previous years of work experience.

(B) his high school grade average.

(C) his appearance.
(D) his attitude.

There are several problems with this item. First, the question asks for a personal quality and the keyed answer
"(B)" ("His previous year of work experience") is not a personal quality. Further, foil "(C)" is not selected. This
seems logical since it is also not a personal quality but a physical quality. Blacks selected foil "(D)" heavily.

BOOKLET 32b 



Outcome/Item Number   Grade   Bias(Type)   Diagnostic   Bad Foil(#)   Bad Format   No Clear Evidence

0307/1                11                   X            X(B)
0307/4 *              08                   X            X(D)

BAD FOIL + DIAGNOSTIC

Booklet 32b, Item 4:

Which ONE of the following situations indicates job success?

(A) You work for a company that has signed a new labor contract and has given all employees an eight
percent raise.

(B) You are asked to work overtime on Friday afternoons for the next two months.

(C) During a conference, you are asked for your advice on changing employee work routines.
(D) You are asked to proofread your letters before they are mailed out.

Foil "(B)" seems on the surface to be a good foil. In other words, being asked to work overtime means the boss
likes your work and therefore you are successful. This foil is not attractive to the students.

Otherwise, all the foils provide diagnostic information for the teacher and students who select them. Foil "(A)",
for example, identifies the student who is unable to discriminate between a general increase and a personal
raise for a good job done.
BOOKLET 32c

Outcome/Item Number   Grade   Bias(Type)   Diagnostic   Bad Foil(#)   Bad Format   No Clear Evidence

0301/31 *             08                   X

DIAGNOSTIC

Booklet 32c, Item 31:
Scoring Key:

This is a summary score which ties together items 1-30.
Mark "(A)" if all 30 categories are scored (A).

Mark "(B)" if all "(*)" categories are scored (A), and one or more of the other categories are scored (B).
Mark "(C)" if five to seven of the "(*)" categories are scored (A), and one or more of the other
categories are scored (B).
Mark "(D)" if less than five of the "(*)" categories are scored (A), and one or more of the other
categories are scored (B).

This item is an application blank that is scored according to degree of correctness. For example, those who
scored in category "(B)" have completed the necessary (i.e. critical) parts of the application. Their response is
sufficient to be able to obtain a job.

The item is scored according to written criteria and is, therefore, diagnostic. Students missing parts of the item
can be instructed to improve their subsequent responses.
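The summary scoring rule for the application blank can be sketched as follows. The wording "five to seven" suggests that there are eight starred (critical) categories, but the text shown here does not say which ones they are, so the set used below is purely hypothetical. The key as written also does not cover a sheet with fewer than all critical categories correct and every other category correct; this sketch assigns such a sheet by its critical count alone, which is an assumption.

```python
def summary_score(scores, critical):
    """Return the summary mark "A" through "D" for one application blank.

    scores   -- dict mapping category number -> "A" (correct) or "B"
    critical -- set of category numbers treated as starred (critical)
    """
    critical_correct = sum(1 for c in critical if scores[c] == "A")
    if all(mark == "A" for mark in scores.values()):
        return "A"
    if critical_correct == len(critical):
        return "B"
    if critical_correct >= 5:
        return "C"
    return "D"


# Hypothetical choice of starred categories; the real ones are not listed here.
CRITICAL = set(range(1, 9))

sheet = {n: "A" for n in range(1, 31)}
sheet[20] = "B"  # one non-critical category missed
print(summary_score(sheet, CRITICAL))
```

Keeping the critical set as a parameter makes it easy to substitute the actual starred categories once they are known.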
APPLICATION FOR EMPLOYMENT

SOCIAL SECURITY NUMBER:

PERSONAL INFORMATION

DATE:
NAME: LAST / FIRST / MIDDLE
NAME PREFIX: MR. / MRS. / MISS / DR. / MS.
PRESENT ADDRESS: STREET / CITY / STATE / ZIP
PERMANENT ADDRESS: STREET / CITY / STATE
PHONE NO.
IF RELATED TO ANYONE IN OUR EMPLOY, STATE NAME AND DEPARTMENT
REFERRED BY

EMPLOYMENT DESIRED

POSITION
DATE YOU CAN START
SALARY DESIRED
ARE YOU EMPLOYED NOW?
IF SO, MAY WE INQUIRE OF YOUR PRESENT EMPLOYER?
EVER APPLIED TO THIS COMPANY BEFORE? PLACE / DATE

EDUCATION

Columns: NAME AND LOCATION OF SCHOOL / YEARS ATTENDED / DATE GRADUATED / MAJOR COURSE OF STUDY
Rows: ELEMENTARY SCHOOL; JUNIOR HIGH OR MIDDLE SCHOOL; HIGH SCHOOL; COLLEGE; TRADE, BUSINESS OR
CORRESPONDENCE SCHOOL

WHAT FOREIGN LANGUAGES DO YOU SPEAK FLUENTLY? READ / WRITE
ACTIVITIES (CLUBS, HOBBIES, INTERESTS, ETC.)

SIDE ONE

FORMER EMPLOYERS (LIST BELOW LAST FOUR EMPLOYERS, STARTING WITH LAST ONE FIRST.)

Columns: DATE, MONTH AND YEAR (FROM / TO) / NAME AND ADDRESS OF EMPLOYER / SALARY / POSITION /
REASON FOR LEAVING

REFERENCES. GIVE BELOW THE NAMES OF THREE PERSONS NOT RELATED TO YOU, WHOM YOU HAVE KNOWN AT LEAST ONE YEAR.

Columns: NAME / ADDRESS / BUSINESS / YEARS ACQUAINTED
Rows: 1; 2; 3

IN CASE OF EMERGENCY NOTIFY: NAME / ADDRESS / PHONE NO.

I AUTHORIZE INVESTIGATION OF ALL STATEMENTS CONTAINED IN THIS APPLICATION. I UNDERSTAND THAT MISREPRESENTATION OR
OMISSION OF FACTS CALLED FOR IS CAUSE FOR DISMISSAL. FURTHER, I UNDERSTAND AND AGREE THAT MY EMPLOYMENT IS FOR NO
DEFINITE PERIOD AND MAY, REGARDLESS OF THE DATE OF PAYMENT OF MY WAGES AND SALARY, BE TERMINATED AT ANY TIME
WITHOUT ANY PREVIOUS NOTICE.

DATE / SIGNATURE

DO NOT WRITE BELOW THIS LINE

INTERVIEWED BY / DATE
REMARKS:
HIRED / DEPT. ASSIGNMENT / POSITION / REPORTING DATE / SALARY, WAGES
APPROVED: 1. EMPLOYMENT MANAGER / DEPT. HEAD / GENERAL MANAGER

SIDE TWO
BOOKLET 41



Outcome/
Item Number


Bias(Type)


Diagnostic


Bad Foil(#)


Bad Format


No Clear
Evidence


0401/1 . t 

0404/1 


07. 












X 


10 












X 


■ 0401/3 ,* 


"07 


. -X (E) 


m 










0407/10 


07 




X 










0408/13 * 


07 




X 










' 0401 /4A-D 


07,10 










X 




0407/9A-D ^ 


07,t0 














0407/1 2A-F 


07,10 




X 






X 





NO CLEAR EVIDENCE

Booklet 41, Item 3:

Lem works in a supermarket as produce manager. He supervises the stock boys and sets a good
example in his work. His work is always outstanding. Lem sometimes uncovers pricing errors which would
cost the store a lot of money. The food in his department is always fresh. Lem is careful to insure that
his customers are well satisfied.

How would Lem's work likely affect his status in the store?

(A) Lem would probably be offered a job by another store.

(B) Lem would be looked up to by his fellow employees as a good worker.

(C) Lem would feel that he is better than everyone else.

(D) Lem's boss might think that Lem is out to get his job.

In this item, blacks tend to respond more to "(D)". It could be interpreted that anyone who puts out extra effort
is out to get someone else's job. This could result in the selection of foil "(D)" by those who have that outlook.
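Statements such as "one group tends to respond more to a given foil" rest on comparing the proportion of each group choosing each option. The sketch below shows one simple way to tabulate that comparison; it is not the statistical procedure used in the study, and the group labels and responses are made up for illustration.

```python
from collections import Counter, defaultdict


def option_proportions(responses):
    """responses is a list of (group, option) pairs.

    Returns {group: {option: proportion of that group choosing the option}}.
    """
    counts = defaultdict(Counter)
    for group, option in responses:
        counts[group][option] += 1
    return {
        group: {opt: n / sum(c.values()) for opt, n in c.items()}
        for group, c in counts.items()
    }


def largest_gap(props, group_a, group_b):
    """Return (option, gap) where the two groups' proportions differ the most."""
    options = set(props[group_a]) | set(props[group_b])
    gaps = {o: props[group_a].get(o, 0.0) - props[group_b].get(o, 0.0) for o in options}
    option = max(gaps, key=lambda o: abs(gaps[o]))
    return option, gaps[option]


# Made-up responses, for illustration only.
data = [("group_1", o) for o in "BDDD"] + [("group_2", o) for o in "BBBA"]
props = option_proportions(data)
print(largest_gap(props, "group_1", "group_2"))
```

A raw gap like this says nothing about sampling error; with real data a significance test would be needed before calling the difference evidence of bias.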

DIAGNOSTIC

Booklet 41, Item 13:

Juan, a social worker, has completed a case which required a great deal of time and effort. Select the
ONE statement which indicates a behavior that shows Juan takes pride in his successful
accomplishment.

(A) Juan told a fellow worker how good he felt about the job.

(B) Juan left work early because the task was completed.

(C) Juan decided to apply for a new job that would pay more money and would not demand so much
time.

(D) Juan talked with Helen about a case on which she was working.

Each of the incorrect responses indicates a different result which might stem from an incorrect interpretation of
the meaning of taking pride in one's accomplishments. For example, an individual may think that leaving early
was an indication of pride.

■« 

Each incorrect response indicates a mind set that the student has which could be corrected with different
instructional approaches. This offers an ideal diagnostic setup which can help combine testing results with
instruction for things such as grouping students for instruction and the evaluation of the instructional program.
BOOKLET 42a


Outcome/
Item Number


Bias(Type)


Diagnostic


Bad Foil(#)


Bad Format


No Clear
Evidence

r 


0403/8 


11 




. X 


~' — — 




- 


0403/1 OA I 


08 




• ■ X 






• 


040'3/ipC 1 


08 




X 






V 


0403/IOC/ 


11 




• ' X 


1 




t 


0403/1 OH \ 


11 




X ■ • 






s 


0403/1 Od/ 


11 




X ■. . 








0403/1 OJ I 


11 




X 








0403/1 OK ] 


08 










• 


0403/1 OK 1 


■11. 




X" 








0403/1 IF 


08 




. x; ; 
• X ^ 






• 


0403/1 ID • 


08 




i • ^. 


• 




0403/11 D 


11 












0403/111 


08 




X 








0403/1 1 1 


11 




X 


• ■ V ■ 






0403/1 1G 


11 


» 


■ X 








0403/11 A 


08 




■ X 








0403/11 A 


11 




X ■ 








0403/1 1C 


11 




' X 




» 




0405/2 * 


11 


X(E) 




. .X (B) 

• 




- 


0405/5B 


08 


>^ (E) 


. 


• 




0405/5D 


11 


X (E) . 










0405/56 


08 


V 


X'. 








0405/6C\ 


• 08- 


• 


X 


% < 

r 






0405/6CJ 


11 




. X 








'0405/6b(' 


08 




X 






* * , 


'040576Br 


'11 


• 


X 








0405/6A\* 


CJ8 




X 




* 




0405/6E / 


11 




X 








04 14/1 3 Ar^ . 


08 




X 








0414/13Fj" ' 


08 . 




X 








/0414/14E 


11 




X 






- 


0414/14D * 


08 




X 








041 4/1 4G * 


08 




■ *x ' 








04 14/1 4 A 


08 




X 








0414/14A 


11 




X 








041 4/1 4C 


11 




X 








041 4/1 4F 


11 




X, 








04 14/1 5 A 


08. 




X • 








q4U/15C 


11 




X 








041 4/1 5D 


08 




X 
BAD FOIL + CULTURAL BIAS

Booklet 42a, Item 2:

Imagine that you began managing a local volunteer project for development of a park in your
neighborhood. The project involved much time and planning for getting jobs done by other people, including
earth movers, planters, tree trimmers, electricians, etc. You find that more and more of your time is
taken up with this project. Problems arise and it is difficult to get cooperation from others. You feel
discouraged and would like to drop the project.

Which ONE of these statements shows a BENEFIT you might gain by staying with the project?

(A) You will make a lot of money if you stay with the project.

(B) You will make a lot of new friends if you stay with the project.
(C) You will be asked to serve as chairperson of other volunteer projects.

(D) You will gain some personal fulfillment if you achieve your goal.

This is a very easy item. Foil "(B)" is not drawing anyone and should be replaced. The cultural variation is
primarily due to an avoidance by Mexican-Americans of the idea that they might be asked to be chairperson of
anything. This whole concept seems to be less than concrete. As a result it is difficult to measure with any
degree of success.

DIAGNOSTIC

Booklet 42a, Item 6:

6. On your Answer Sheet, indicate whether you strongly agree, agree, are undecided, disagree or strongly
disagree with each statement below by darkening the letter as follows:

(A) STRONGLY AGREE   (B) AGREE   (C) UNDECIDED   (D) DISAGREE   (E) STRONGLY DISAGREE

(A) (B) (C) (D) (E)  A person should practice disciplining himself/herself to complete
tasks which should be done but are unpleasant.

(A) (B) (C) (D) (E)  A person should stay with a task which is boring but must be
done.

(A) (B) (C) (D) (E)  If a person has a lot of work to do, he/she should not complete all
the work.

(A) (B) (C) (D) (E)  When a person has a job to do but also wants to do something for
fun, he/she should finish the job first and get it out of the way.

(A) (B) (C) (D) (E)  ... son does not want to do them.

(A) (B) (C) (D) (E)  A person probably feels good about himself/herself if he/she
sticks with a task until it is complete.

(A) (B) (C) (D) (E)  A person should not attempt to complete a task which he/she
does not think he/she would like.

The information in this item is useful in identifying the beliefs of students and the strengths of their beliefs.
Black students tended to disagree with part a.

These data should be useful in planning instructional strategies.
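One common way to turn agree/disagree letters into a measure of the strength of a belief is to code them numerically and reverse the coding for negatively worded statements. The report does not say that this was done, so the following is only a sketch under that assumption; the 5-to-1 coding and the choice of which statements to reverse are illustrative.

```python
# Assumed numeric coding: (A) strongly agree = 5 ... (E) strongly disagree = 1.
LIKERT = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}


def code_response(letter, reverse=False):
    """Convert a darkened letter to a 1-5 score, flipping it for negatively worded statements."""
    value = LIKERT[letter]
    return 6 - value if reverse else value


def mean_score(letters, reverse=False):
    """Average coded score for one statement across a list of student responses."""
    coded = [code_response(letter, reverse) for letter in letters]
    return sum(coded) / len(coded)


# Made-up responses to one negatively worded statement, for illustration only.
print(mean_score(["D", "E", "E", "C"], reverse=True))
```

After reverse coding, a high mean on every statement points in the same direction (a belief in finishing tasks), which makes group comparisons easier to read.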
DIAGNOSTIC

Booklet 42a, Item 10:

Read the following list of words and phrases. On your Answer Sheet, darken (A) for each item which
may have influenced your attitude toward work. Darken (B) for each item which probably did not.

(A) (B) a. reading
(A) (B) b. mathematics
(A) (B) c. athletics
(A) (B) d. sex
(A) (B) e. age
(A) (B) f. family
(A) (B) g. socio-economic background
(A) (B) h. education
(A) (B) i. work experience
(A) (B) j. culture
(A) (B) k. peers (friends)
(A) (B) l. media: television, motion pictures, newspapers, magazines, etc.

There are no correct answers for these items. They are survey questions which naturally have different
response patterns for different sexes and ethnic groups since each person's attitude toward work is influenced by
different things.
0418/12C
0418/12E *
0418/12D *
0418/12B
0418/12B
Bias(Type)



: X(E) 
'X (E) 



Diagnostic 



X 

=■ X 
* X'. 
••X 
X 
X 



Bad Foil(#)



K{C) 



'V X,(A) 



Bad Format



No Clear
Evidence



BAD FOIL + CULTURAL BIAS

Booklet 42b, Item 8:

Which statement reflects a POSITIVE ATTITUDE toward lawyers?

(A) Lawyers take advantage of people in trouble.
(B) Lawyers help people deal fairly with each other.

(C) Lawyers will not help people who do not have money to pay legal fees.
(D) Lawyers help people to cheat on their income tax.

There are two problems with foil "(D)": (1) it is not a positive attitude, and (2) most people are aware of the fact
that lawyers are strictly accountable for staying within the law and that helping you cheat on income tax is
outside the law. As a result, no one chose this foil. Foil "(C)" is not attractive to Mexican-Americans.
DIAGNOSTIC

Booklet 42b, Item 12:

On your Answer Sheet, indicate whether you strongly agree, agree, are undecided, disagree, or
strongly disagree with each of the following statements by darkening the letter which you feel is
appropriate as follows:

(A) STRONGLY AGREE   (B) AGREE   (C) UNDECIDED   (D) DISAGREE   (E) STRONGLY DISAGREE

(A) (B) (C) (D) (E)  a. Being a lawyer is a more useful occupation in society than being
a mail carrier.

(A) (B) (C) (D) (E)  b. Artists perform useful tasks in our society.

(A) (B) (C) (D) (E)  c. Auto mechanics have less dignity than teachers.

(A) (B) (C) (D) (E)  The dignity of a job depends on the salary involved.

(A) (B) (C) (D) (E)  The dignity of a job depends on the quality of performance of the
people involved.

Item 12 is a difficult format for many students. Some of the statements are about things that are tradition
bound, such as part (a). Minorities agreed with (e), showing that they have grown up with the idea that high
class white collar jobs are more useful.
CULTURAL BIAS

Booklet 51, Item 8:

A team of people was chosen to discuss school bus routes and solve problems with time schedules.
The team had a hard time arranging a plan of action. Everyone talked at once, argued back and forth,
and did not listen to the chairperson. Each member voiced his/her ideas to one or two other members
rather than directing his/her comments to the entire group. At the end of the project, the committee still
had not agreed upon a clear-cut set of suggestions.

On your Answer Sheet, darken the letter which shows to what degree this group of people worked with
each other as a team.

(A) They had a very effective system of procedure as a team.

(B) They had the makings of a good team, but one or two people spoiled it.

(C) They were not effective as a team.
(D) They would have been effective as a team had they had more time to work.

Blacks tended to select foil "(A)", which is the opposite of the situation that is true. This could be symptomatic
of a gang approach to decision making where only one or two key people are involved in making decisions.
This would cause members to speak only to those one or two key people who are in control.

Also, in many city gangs, there is arguing among members during times of decision making with no clear
purpose being defined by the group. This would, then, seem to be an effective procedure to those with inner city
experience.

BAD FOIL + INAPPROPRIATE KEY

Booklet 51, Item 10:

Suppose you are part of a team assigned to recommend special units of study for the drafting of a
building construction project. Frank, the chairperson of the group, seems to be losing interest in the
project.

For each of the three questions below, darken (A) on your Answer Sheet if your answer to the question
is "Probably so." Darken (B) if your answer is "Probably not." Darken (C) if you do not know what you
would do.

a. Will you ask Frank to let the person whom you thought could do a better job be the chairperson?

(A) Probably so

(B) Probably not

(C) I don't know what I would do.

b. If you think a suggested unit is not a good one, will you volunteer your opinion?

(A) Probably so

(B) Probably not

(C) I don't know what I would do.

c. Will you agree to recommend only units that the majority wants?

(A) Probably so

(B) Probably not

(C) I don't know what I would do.

Part a is keyed "(B)". Many students chose "(A)", which may indeed be a more appropriate response. It may be
argued that if a person loses interest in a project that he/she is in charge of, it is appropriate to suggest that
that person be replaced with someone else who has a much greater interest in the project.

DIAGNOSTIC

Booklet 51, Item 11:

As a part of a social studies project, a class divided into groups of seven to write answers to social
situations. Your group was given three questions to discuss. As a member of this group, how would you
probably have worked in the group?

On your Answer Sheet, darken (A) for each of the statements that you think is true about yourself.
Darken (B) for the ones that you do not think are true about yourself.

(A) (B) a. I would be a leader in the group.
(A) (B) b. I would not be a leader, but I would be active in expressing my feelings.
(A) (B) c. I would go along with whatever the leaders decided.
(A) (B) d. If the group couldn't decide on the answer, I would take a vote and write the
answer favored by the majority.
(A) (B) e. I would prefer not to participate actively, but I would be willing to write the
answers.
(A) (B) f. I'm not sure how I would work in the group.

In this item there are no correct answers, and the main purpose of data gathered in this item is detecting
differences in values across cultural groups. In part "e" Mexican-Americans respond with a higher proportion of
"Yes" responses.

Many girls marked "Yes" to part "b", indicating the lack of interest in being the leader of a group. This is
consistent with traditional sex roles. The "interaction" that exists here could be changed with the new roles
emerging due to the women's rights movement.

BAD FOIL

Booklet 51, Item 17:

Juan and his boss are walking to the front door of the building where they both work. His boss opens the
door for Juan and motions for him to go ahead into the building. What would be the best thing for Juan to
do?

(A) say "No, thank you," and wait until his boss goes in

(B) go in and apologize to his boss for not opening the door for him

(C) go in and say "Thank you"

(D) go in and say nothing but watch for a chance to open a door for his boss

Although the probable intent of foil "(B)" was to identify those students who would demean themselves in front
of the boss, this is a highly unlikely occurrence in these days of equal rights. It is particularly notable that the
girls were the ones least likely to select the foil.

BAD FOIL

Booklet 51, Item 22:

The film ran a little late during third period, so the students left without putting the chairs back in place.
This was

(A) a polite thing to do because the teacher would not mind their leaving the chairs out.

(B) not a polite thing to do because those leaving or coming into the room could stumble over the
clutter of chairs.

(C) not a polite thing to do because they could have left before the film was over.

(D) a polite thing to do because they knew that the students in the next period would have to move the
chairs anyway.

The logic in foil "(A)" is not sound. Very few students, even at grade 7, are going to equate "politeness" and
"not minding" on the part of another. Politeness is an action that results in appreciation, not passive
acceptance of inconsiderate action.
BOOKLET 52

BAD FOIL

Booklet 52, Item 2:

An office staff of about fifty people was planning to have a Christmas party. Which ONE of the following
means of communication would be the MOST effective way to ensure that everyone in the office would
know about the party?

(A) posting a bulletin board announcement

(B) telling a few workers to pass the word to the others

(C) passing around a written notice

(D) making telephone calls to employees' homes

Word of mouth is commonly known to be a poor way of sending information. It is often inefficient, inaccurate,
and incomplete. Most people probably knew this, and so foil "(B)" is an unlikely response.
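A foil that almost no one selects, as described above, can be flagged directly from the response counts. The sketch below is illustrative only: the 5 percent cutoff, the keyed answer, and the responses are assumptions, not values reported in the study.

```python
from collections import Counter


def weak_foils(choices, key, options="ABCD", threshold=0.05):
    """Return the foils chosen by less than `threshold` of the students.

    choices   -- list of option letters, one per student
    key       -- the keyed (correct) option, which is never reported as a foil
    threshold -- assumed cutoff below which a foil is treated as not functioning
    """
    if not choices:
        return []
    counts = Counter(choices)
    total = len(choices)
    return [o for o in options if o != key and counts[o] / total < threshold]


# Made-up responses with a hypothetical key of "C", for illustration only.
responses = ["C"] * 14 + ["A"] * 4 + ["D"] * 2
print(weak_foils(responses, key="C"))
```

A foil flagged this way is a candidate for replacement, since a four-choice item with one dead foil behaves like a three-choice item.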



NO CLEAR EVIDENCE

Booklet 52, Item 16:

Read the following descriptions of people interacting in work situations. Which description do you think
is the BEST example of RESPECTFUL behavior between people of different races?

(A) Mike, a black, and Charles, an Anglo, work together on a government research project. When Mike
and Charles disagree, Mike goes directly to the supervisor to complain.

(B) Mr. Green, an Anglo, and Mr. Swartz, a black, have worked next to each other on the same job for
ten years. Mr. Green and Mr. Swartz have seldom talked to each other.

(C) Fred Bear has worked in a factory close to the Indian reservation for five years. He has been a
faithful and hard working employee. Mr. Bear wants to take Thursday off from work to attend a
tribal celebration. The boss has threatened to fire him if he takes that day off.

(D) Mei Lee lives and works in Chinatown. Sally Sands, a college student, has been hired as a summer
employee at the plant where Mei Lee works. Mei Lee introduced Sally to other workers on the job.

This is a very easy item. Sometimes this results in chance patterns of cultural variation. There doesn't seem to
be any clear evidence of bias in the item. It is not clear why girls would be drawn to "(B)" and boys drawn to
"(C)".
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BAD FOIL 

Booklet 52, Item 30:

Listed below are attitudes or beliefs expressed by some people. Select the belief/attitude which you
think MOST indicates prejudice.

(A) People should be judged by their performance.

(B) People with long hair are generally lazy.

(C) It is difficult to know what people are really like when you first meet them.

(D) Most women have good eyesight.

Foil "(D)" is an inappropriate foil that seems to have been thrown in because something better could not be
thought of. It is better to reduce the number of foils instead of including one that is ineffective. A better foil
might be "It is easy to judge people after you first talk to them."

DIAGNOSTIC 

Booklet 52, Item 31:

Which ONE of the following statements describes what might happen if the people of one race are
PREJUDICED against the people of a different race?

(A) Communication will increase between the people of different races.

(B) People of different races will like each other better.

(C) Clashes between the people of different races will decrease.

(D) Understanding between people of different races will be hard to achieve.

This item has foils that are diagnostic of a clear understanding of what is meant by "PREJUDICED". A
response to a wrong foil shows that there is confusion on the part of the student about the term and indicates
the direction of the confusion.

DIAGNOSTIC

Booklet 61, Item 3:

If you have problems or need advice, people with professional training can often help you. Darken (A)
below if there is someone on the school staff to whom you would feel free to go if you had a problem
concerning the following. Darken (B) for the others.

your schoolwork         (A)  (B)  a.
your home life          (A)  (B)  b.
a career choice         (A)  (B)  c.
your personal life      (A)  (B)  d.

This item has no correct answers. Reports based upon the data collected are useful in diagnosing the
willingness of a student to utilize school staff for various types of personal problems. It is certain that
willingness to use school staff is going to differ among all types of students. Data on this item could be used to
identify students who need to be made aware of the types of help that a staff member can give as well as their
willingness to give the help.

CULTURAL BIAS + BAD FOIL

Booklet 61, Item 1:

Jeanie has been worried about her relationship with her boyfriend. Her parents don't like him and this
adds to the problems that already exist. Jeanie cannot concentrate in school and her teacher is worried
about her work. The only person she sees and talks to almost every day is her young aunt who happens
to be a counselor at her school. She feels that she must talk to someone about her problems.

Of the following, select the ONE person with whom Jeanie would probably first discuss the problem.

(A) her parents

(B) her boyfriend

(C) her aunt, the school counselor

(D) her teacher

This item is very highly tied to cultural background. The person a student is most likely to go to first to discuss
a problem differs according to the background (ethnic and otherwise) of the child. For example, blacks were
more likely to discuss the problem with parents [foil "(A)"], Mexican-Americans were about evenly divided
between parents "(A)" and boyfriend "(B)", while "others" (mostly anglos) were heavily attracted to foil "(B)".
Moreover, foil "(D)" is a very unlikely choice.

CULTURAL BIAS + BAD FOIL

Booklet 62, Item 8:

How were the two lunches probably paid for?

(A) The cashier liked them and paid.

(B) The teachers would pay for the lunches.

(C) The school kept a special lunch fund.

(D) Other students joined in and paid for the lunches.

This is an inappropriate item. None of the foils are viable choices. Moreover, blacks were attracted to foil "(A)".



BAD FOIL + DIAGNOSTIC 



Booklet 62, Item 12:

The nursing home is used for what or whom?

(A) older people who cannot take care of themselves

(B) babies who cannot take care of themselves

(C) young plants that people buy for homes and businesses

(D) people who are training to be nurses

Although foil "(C)" could be a common word confusion (nursing home for nursery), no one is attracted to it. The
foil should be replaced with some other idea.

The other two foils are good diagnostic statements which would help identify student problems.



BOOKLET 71a

[Table residue: columns were Outcome/Item Number, Grade, and Bias Type (including Diagnostic, Bad Format, and No Clear Evidence); the cell marks could not be aligned to columns. Legible rows include outcome/item 0705/3 (grade 07), 0705/4 (grade 07), 0705/6 (grade 10), 0705/7 (grades 07 and 10), and 0705/8 (grade 10).]


BAD FOIL + DIAGNOSTIC 

Booklet 71a, Item 1:

In the United States, we all have "freedom of speech." This means that we have

(A) freedom to say anything we want to, anytime, and about anyone.

(B) freedom to speak our thoughts, but not to put them into printed form.

(C) freedom to say or print what we want, as long as it is not false information.

(D) freedom to appear on radio or television whenever we want.

Foil "(D)" is a bad foil which attracts no one. It is clear that nearly everyone knows that it is difficult to appear
on television.

The other foils are diagnostic in that they are very common misconceptions or misinterpretations. They can be
specifically taught by the teacher.

BAD FOILS 

Booklet 71a, Item 7:

The United States Constitution guarantees its citizens many rights and freedoms. However, citizens can
only have these rights as long as they

(A) remain registered voters in the U.S.

(B) do not infringe upon the rights of others.

(C) are either fully employed or are in school.

(D) have never been arrested for a major offense.

Foil "(C)" is very unattractive to students of both grades. Why not use something like "are living in their home
town when the election is held" or "are living in the United States"?



NO CLEAR EVIDENCE 

Booklet 71a, Item 8:

You have a number of rights that are granted by the government. Which ONE of the following is NOT one
of those rights?

(A) the right to free speech

(B) the right to print money

(C) the right to an education

(D) the right to a trial by jury

There is no clear evidence in the item or the response pattern to explain the strong variability.
Part of this may be due to the fact that the item is very easy, thus leaving only scant numbers to respond
haphazardly to the foils.



BAD FOIL + CULTURAL BIAS + DIAGNOSTIC

Booklet 71a, Item 10:

Harold has a good job working as a delivery man for a parcel firm. Gina works as a checker in a supermarket.
They have been married for four years. Last year they borrowed money from the bank to buy a
small home. Gina is thinking about quitting her job. The MOST probable result of Gina's not working
would be that

(A) Gina and Harold would probably concentrate on furnishing their new home more quickly.

(B) the bank would probably repossess their house.

(C) Gina and Harold would probably use their charge accounts more.

(D) Gina and Harold would probably buy fewer luxury-type items.

There is a large minority response to foil "(C)", which encourages the increased use of charge accounts.
Although this foil may indicate an ethnic variation, it could be considered diagnostic of a need to educate certain
groups to the need to utilize charge accounts with care.

Foil "(A)" is totally ineffective. A stronger foil should be written to replace it.



DIAGNOSTIC

Booklet 71a, Item 11:

Some people do not have work to do. Which ONE of the following is the MOST LIKELY effect of not
working?

People who do not work

(A) will probably not experience the personal satisfaction they might have experienced doing a job.

(B) will be totally unhappy and very poor.

(C) will want to begin working at any kind of job right away.

(D) will feel that not working is so great that they will encourage everyone they know not to work.

The foils in item 11 direct the teacher to information that would help the student realize the value of the "right"
job. They also will help direct teaching to orient the students toward understanding 1) What are the effects of not
working?, 2) What are the effects of having a job you don't like?, or 3) Why would you not like being out of
work?

Booklet 71b, Item 4:

For many years, thousands of women have worked in the lower-paying jobs in business. Regardless of
their experience or education, it has been difficult for them to advance to, or be hired for, management-level
jobs. The women's liberation movement has done much to expose this waste of human resources
and to make such discriminatory practices unlawful.

Which ONE of the following is an improvement in our economic system that should result from the efforts
of the women's liberation movement?

(A) People will be hired according to qualifications.

(B) More men will choose to do manual labor.

(C) Secretaries will be paid lower wages.

(D) More men will be hired as business managers.

Foil "(C)" is a direct contradiction to the information in the paragraph. This has resulted in an extremely low
response to the foil. A different incorrect response such as "only women will be hired into the higher paying
jobs" would be more appropriate than the current foil.



CULTURAL BIAS

Booklet 71b, Item 8:

People have become more aware of the frequent inequality in wages of paid male and female employees
doing the same type of work. What effect has this increased awareness had on our economic
system?

(A) upward adjustment of some women's wages

(B) reduction of the number of employed women

(C) downward adjustment of the gross national product

This item seems to be overly negative toward the effects of the women's movement. There seems to be little
explanation that can clarify the pattern of incorrect responses, however.

DIAGNOSTIC

Booklet 72a, Item 1:

Which of the following is paid for by state taxes?

(A) postal service

(B) national defense

(C) highway maintenance

(D) telephone service

This is an excellent example of a diagnostic foil item. It identifies a misunderstanding of the sources of financing
for various public agencies. Any student not understanding taxes would likely be drawn off by the foils,
giving teachers information to be used to correct the deficiencies.



DIAGNOSTIC + CULTURAL BIAS

Booklet 72a, Item 12:

Which ONE of the following quotations reflects an individual's positive attitude toward participation in
the economic system of the United States?

(A) "Big businesses cheat on their taxes, so I do too."

(B) "Irish wool is of better quality than local wool."

(C) "I've invested my savings in a local corporation."

(D) "I think that I should be able to get money any way I can."

Blacks are drawn to "(A)" and Mexican-Americans to "(D)". If minority individuals have experienced
discriminating practices, this might explain the aforementioned variation in foil responses. However, all of the
foils are diagnostic and can be used as instructional guidelines. An improved correct answer could reduce
cultural variation.

DIAGNOSTIC

Booklet 72a, Item 6:

Which ONE of the following levies taxes?

(A) counties

(B) churches

(C) banks

(D) stores

This is a very simple item which requires knowledge of two things: 1) What are taxes? and 2) What charges
are legitimate for counties, churches, banks, and stores to make for services rendered? Each one collects
money but only one levies taxes, the counties.

CULTURAL VARIATION



Booklet 72b, Item 8:

Which ONE of the following is an example of soil conservation?

(A) oil well drilling

(B) contour farming

(C) open pit mining

(D) lumbering

This item is oriented toward jobs that have been traditionally male-dominated. As a result, the content makes it
difficult for girls to know enough to respond to any answer. This variation could be corrected by improving the
instructional program.

BAD FOIL

Booklet 72b, Item 11:

Which ONE of the following would be a good safety practice for employees to observe when working in
a factory?

(A) Employees should keep bathroom doors locked at all times.

(B) Employees should be able to follow fire drill procedures quickly.

(C) Employees should bring chairs from home if those supplied by the company are uncomfortable.

(D) Employees should organize and demand higher wages.

The item components are not related in a meaningful way. None of the foils relate to anything tied to a safety
practice.

DIAGNOSTIC

Booklet 81a, Item 6:

With new machines and computers changing routine jobs, some assembly line and office workers may
be fearful of

(A) overproduction of goods.

(B) losing their jobs.

(C) increase in cost of goods.

(D) longer working hours.

The Mexican-Americans were attracted to foil "(D)", "longer working hours." This is probably due to the lack of
knowledge of computers and a lack of experience. For many, it could mean that longer working hours would be
required to learn how to use the machinery. This would be a common misconception for someone who has not
been oriented toward mechanization.

Booklet 81a, Item 13:

Which ONE of the following is the reason why many companies choose to pay salesmen on the
basis of how much of the company's product they sell?

(A) Such pay will not show on the company's records.

(B) The salesmen will make more money if they are paid that way.

(C) The salesmen will sell more of the product if they are paid that way.

(D) The salesmen will not have to be paid any fringe benefits.

Foil "(A)" is not attractive to the students. It seems that all students are aware that all payroll that is "regular"
is recorded both in company records as well as in tax records. Foils need to be likely choices.

The other two foils provide likely reasons for paying a commission. A student who chooses one of these foils has
a definite lack of understanding that can be rectified with some instruction.



DIAGNOSTIC 



Booklet 81a, Item 21:

Resources become goods when they are made ready for human use. Water may be considered "goods"
rather than a resource when

(A) it is flowing in a river.

(B) a dam is built to stop flooding.

(C) it is piped into your home.

(D) it is polluted by chemicals from factories.

Each of the foils represents a logical misunderstanding. Blacks selected foil "(D)" frequently. If the student
did not know the term "goods", he/she may be drawn off by the term "pollution".

DIAGNOSTIC

Booklet 81b, Item 2:

Eight big logging companies raised the price of raw lumber by a large amount. The housing industries
felt they had no choice but to raise the price of the homes they offered for sale. What effect did the price
change probably have on the demand for the houses?

(A) The demand was probably greater.

(B) The demand was probably less.

(C) The demand was probably the same.

The supply and demand items are all diagnostic if they have only the three possible results: go up, stay the
same, go down. This is a good example of an item which diagnoses a problem which instruction can supposedly
rectify. A wrong response gives a clear direction for instruction.



BAD FOIL + DIAGNOSTIC

Booklet 81b, Item 15:

A summer heat wave in New York caused people much discomfort. More people began to drink
lemonade during the days of the heat wave. What effect did this action of consumers have on production?

(A) More lemons were grown.

(B) More lemon trees were planted immediately.

(C) More lemons probably were used to make scented wax.

(D) More lemons probably were used to make frozen juice.

Foil "(C)" is inappropriate. If there aren't enough lemons for lemonade, surely they won't have enough to increase
the production of lemon scented wax. This foil should be replaced.

The other two foils are diagnostic of a lack of understanding that it takes a very long time to grow lemon trees
which will bear fruit. These misunderstandings are important keys to additional instruction.

DIAGNOSTIC + CULTURAL BIAS



Bopklet 82, Item^: 




The Peterson family had to make an important decision concerning their budget. They had not expected
to have to spend their savings for repairs on their storm-damaged house. The family had to decide
whether to borrow money for their planned trip to the Rocky Mountains or to spend their time at home
and save for next summer's trip. The family realized that they would need to work more in order to repay
any money borrowed. After discussing the problem, the family voted to take the trip they had planned.

Which ONE of the following was the major factor affecting the decision of the Petersons?

(A) They placed a high value on saving money.

(B) They placed a high value on vacation travel.

(C) They placed a low value on home repairs.

(D) They placed a low value on work.

Foil "(C)" is very attractive to blacks; this may be due to cultural background factors. Moreover, poorer readers
probably didn't realize that the repairs were already done. If the repairs were not done, foil "(C)" could be
considered correct also.



Booklet 82, Item 7:

When Andy was [age illegible], he began working for the Pioneer Motor Freight Company as a dock worker. His job
was to load and unload trucks. Andy's gross income per week was $160.00. His take-home pay after
deductions was $110.37.

After working at Pioneer for two years, Andy no longer worked on the docks. He drove the company
trucks regularly. Then his gross income per week was $200.00.

As a result of making more money, Andy had

(A) more money deducted from his paycheck than before.

(B) less money deducted from his paycheck than before.

(C) the same amount of money deducted as before.

The cultural variability indices are very high for this item (especially at grade 11 where V = 0.61). There is a
possibility that this was due to cultural background factors. A more likely explanation, however, is that the item
is diagnostic of the students' understanding of the relationships of changes in income to the amount of deductions.

BAD FOIL



Booklet 91, Item 3:

Which ONE of the following courses would probably be MOST helpful to you if you wished to be a bank
clerk after graduating from high school?

(A) chemistry

(B) bookkeeping

(C) home economics

(D) American history

This is a very easy item. This may be due to the inappropriate foils used or because all tenth graders know the
requirements for being a bank clerk. In particular, foil "(D)" attracted no students.

DIAGNOSTIC

Booklet 92, Item 1:

To be a legal secretary, one need NOT be able to

(A) spell correctly.

(B) type accurately.

(C) have a good command of the English language.

(D) debate a legal case.

This item has two foils ["(A)" and "(B)"] that are not operating. For example, foil "(B)" is too easy because there is
a tendency to associate typing with secretarial positions, whether it be legal secretary or not.

DIAGNOSTIC 

Booklet 92, Item 9:

On your Answer Sheet, darken (A) for all those learning experiences OUTSIDE of school which you feel are
important to you in making a decision about your career. Darken (B) for the others.

talking to parents about their jobs                                     (A)  (B)  a.
talking to friends about their present or future jobs                   (A)  (B)  b.
seeing examples of jobs on television                                   (A)  (B)  c.
seeing people you don't know working on various jobs                    (A)  (B)  d.
reading books or magazines about people with various jobs               (A)  (B)  e.
having had experiences with jobs after school or during the summer      (A)  (B)  f.
talking with relatives about their jobs                                 (A)  (B)  g.
belonging to a club or group                                            (A)  (B)  h.
participating in a sport                                                (A)  (B)  i.
traveling or moving to another city                                     (A)  (B)  j.
taking lessons in painting, piano, guitar, dancing, etc.                (A)  (B)  k.
visiting a job location                                                 (A)  (B)  l.
working at home on a hobby or project                                   (A)  (B)  m.
doing volunteer work (such as Candy Striper) in the community           (A)  (B)  n.
having had no outside-school experience which has been important        (A)  (B)  o.

This is a multipart item with no correct answers. The results are primarily used for survey purposes to identify
various learning experiences OUTSIDE of school that students have had. For this reason, the responses are
diagnostic of student experience.



BAD FOIL

Booklet 92, Item 12:

Reading the editorial sections of newspapers will give you

(A) individuals' views on various issues.

(B) factual information only.

(C) the best information available on various issues.

(D) information concerning television scheduling.

Few students chose foil "(D)". This may be because the content of the foil is very different from that of the
other three. One alternative would be to use something like "a summary of the most important events of the
week" or "current book review information". The second suggestion would identify those who confused book
editor with editorial.

SIGNIFICANCE TESTS FOR ISI AND ESI AND OSI

One may test the statistical significance of the difference in the ISI (ESI, OSI) for the experimental and
comparison groups by the following statistic:

\[
Z = \frac{(p_1 - p_2) - (p_1' - p_2')}{\sqrt{\sigma^2_{p_1 - p_2} + \sigma^2_{p_1' - p_2'}}}
\]

where

\[
\sigma^2_{p_1 - p_2} = \frac{p_1(1 - p_1)}{n} + \frac{p_2(1 - p_2)}{n} + \frac{2 p_1 p_2}{n},
\qquad
\sigma^2_{p_1' - p_2'} = \frac{p_1'(1 - p_1')}{n'} + \frac{p_2'(1 - p_2')}{n'} + \frac{2 p_1' p_2'}{n'}
\]

and (n, n_1, n_2) refers to the experimental (instruction) group, while (n', n_1', n_2') refers to the comparison
(no instruction) group.

The above Z statistic is approximately (for "large" n, e.g., n greater than 30) distributed according to a standard
normal under the null hypothesis (true index for experimental group is equal to true index for the comparison
group). The formulation utilizes well-known properties of the multinomial distribution, the formula for
the variance of a linear combination of correlated variables, and the central limit theorem (cf., e.g., Wilks,
1962). A single-tailed test may be conducted and the upper-tail significance probability calculated. A
significant Z (e.g., at .05 or .01 level) indicates (i) the ISI of the experimental group is significantly higher than
that of the comparison group and (ii) the magnitude and statistical significance of the ISI may be attributed to
the item's sensitivity to instruction, and not some extraneous factor, such as maturation (cf., Campbell and
Stanley, 1963).

An analogous Z test may be conducted for the ESI and OSI (just substitute m and N respectively for the n in the
above formulation).
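
The Z test described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, each index is taken to be a difference of two multinomial proportions p1 = n1/n and p2 = n2/n, and the variance of that difference is taken to be [p1(1 - p1) + p2(1 - p2) + 2 p1 p2] / n, which follows from the multinomial covariance -p1 p2 / n. The counts in the usage lines are made up.

```python
import math

def index_and_variance(n, n1, n2):
    """Return (p1 - p2, estimated variance of p1 - p2) for one group.

    Assumption: p1 = n1/n and p2 = n2/n are proportions from a single
    multinomial sample of size n, so Cov(p1, p2) = -p1*p2/n.
    """
    p1, p2 = n1 / n, n2 / n
    variance = (p1 * (1 - p1) + p2 * (1 - p2) + 2 * p1 * p2) / n
    return p1 - p2, variance

def sensitivity_z(experimental, comparison):
    """Z statistic and upper-tail probability for the difference in indices.

    Each argument is a hypothetical (n, n1, n2) tuple of counts.
    """
    index_e, var_e = index_and_variance(*experimental)
    index_c, var_c = index_and_variance(*comparison)
    z = (index_e - index_c) / math.sqrt(var_e + var_c)
    # Upper-tail probability under the standard normal distribution.
    p_upper = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_upper

# Made-up counts: instruction group vs. no-instruction group.
z, p = sensitivity_z((100, 50, 10), (100, 20, 20))
print(f"Z = {z:.3f}, upper-tail p = {p:.5f}")
```

A single-tailed decision then compares the upper-tail probability with the chosen level (.05 or .01).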
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APPENDIX J 



AN ALTERNATIVE PROCEDURE FOR DEVELOPING THE SURVEY TEST

The following heuristic procedure is an alternative to the stepwise regression analysis:

Compute the mean outcome score for (i) students in upper grade (10 and 11) and (ii) students in lower
grade (7 or 8) for each outcome. Denote these by Y_1 and Y_2, respectively. For each item (within a given
outcome) and each student, perform the following analysis:

1. If the student's score is less than the mean outcome score (either Y_1 or Y_2 depending on which level
the student is at), and he answers the item correctly, assign the rating 0.

2. If the student's score is less than the mean outcome score and he answers the item incorrectly, assign
the rating 1.

3. If the student's score is greater than the mean outcome score and he answers the item correctly,
assign the rating 1.

4. If the student's score is greater than the mean outcome score and he answers the item incorrectly,
assign the rating 0.

5. If the student's score is exactly equal to the mean score, assign the rating 1/2.

6. Sum the ratings for each item, over all students and over both grade levels.

7. Select the item with the highest rating, item M_1, say. (Denote this rating R_M1.)

8. Subtract the score for item M_1 (X_M1 = 0 if the response is incorrect, X_M1 = 1 if the response is
correct) from the students' outcome scores and recompute the ("adjusted") mean outcome scores for
the 2 grade levels.

9. Repeat steps 1 through 7 with the adjusted means, i.e., perform steps 1 through 7 after eliminating
item M_1.

10. Select the item (M_2) with the highest "adjusted" rating, using the "adjusted" means. Denote this rating
R_M2.

11. Compute the ratio R_M2/R_M1.

Select the items M_1 and M_2 only if R_M1 and the ratio R_M2/R_M1 are sufficiently high.
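
The rating procedure above can be sketched as follows. This is a minimal illustration, not the project's program: the function names, the 0/1 response matrix, and the grade-level labels are all assumptions, and the "sufficiently high" thresholds are left to the user because the text does not state them.

```python
from statistics import mean

def item_ratings(responses, levels, active_items):
    """Steps 1-6: sum the agreement ratings for each active item.

    responses: one list of 0/1 item scores per student (assumed layout).
    levels: the grade-level label of each student, e.g. "upper" or "lower".
    active_items: indices of the items still counted in the outcome score.
    """
    scores = [sum(row[j] for j in active_items) for row in responses]
    level_means = {
        lv: mean(s for s, l in zip(scores, levels) if l == lv)
        for lv in set(levels)
    }
    ratings = {j: 0.0 for j in active_items}
    for row, score, lv in zip(responses, scores, levels):
        m = level_means[lv]
        for j in active_items:
            if score == m:
                ratings[j] += 0.5           # step 5
            elif (score > m) == (row[j] == 1):
                ratings[j] += 1.0           # steps 2 and 3
            # steps 1 and 4 contribute a rating of 0
    return ratings

def select_two_items(responses, levels):
    """Steps 7-11: pick M1, drop it, pick M2, and return the ratio."""
    items = list(range(len(responses[0])))
    first = item_ratings(responses, levels, items)
    m1 = max(items, key=lambda j: first[j])
    remaining = [j for j in items if j != m1]   # step 8: remove item M1
    second = item_ratings(responses, levels, remaining)
    m2 = max(remaining, key=lambda j: second[j])
    return m1, first[m1], m2, second[m2], second[m2] / first[m1]

# Made-up data: four students at one grade level, three items.
data = [[1, 1, 1], [1, 1, 0], [0, 1, 0], [0, 0, 0]]
print(select_two_items(data, ["upper"] * 4))
```

Ties for the highest rating are broken here by taking the lowest item index, which is a choice of this sketch rather than something the procedure specifies.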
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APPENDIX K 



THEORETICAL BASIS FOR THE USE OF 'SELF-WEIGHTING'
ESTIMATORS IN THE FIELD TEST DATA

The determination of the number of schools to be selected in each stratum is by "proportional allocation" with
respect to the number of students in each stratum. This may be stated formally as follows:

(1)    \[ n_h = n \, \frac{M_h}{M} \]

where

n = number of schools in the sample (for a given instrument at a given grade level)
M_h = number of students in all schools in stratum h (at a given grade level)
M = total number of students in the state (at a given grade level)
n_h = number of schools to be selected from stratum h (for a given instrument at a given grade level)

The selection of schools within strata is with probability proportional to size of school (p.p.s.). This may be
stated mathematically as follows:

(2)    \[ z_{hi} = \frac{M_{hi}}{M_h} \]

i = 1, 2, ..., N_h; h = 1, 2, 3, where z_hi is the probability of selecting the ith school in stratum h, M_hi is the
number of students in the ith school in stratum h, and N_h is the number of schools in stratum h.

The classical unbiased estimator of the (true) p-value (proportion getting an item correct) is

(3)    \[ \hat{p} = \frac{1}{M} \sum_{h} \frac{1}{n_h} \sum_{i=1}^{n_h} \frac{M_{hi} \, p_{hi}}{z_{hi}} \]

(cf., Cochran, 1963), where p_hi is the p-value for the ith sampled school in stratum h. Substituting (1) and (2)
into (3), one obtains

(4)    \[ \hat{p} = \frac{1}{n} \sum_{h} \sum_{i=1}^{n_h} p_{hi} \]

so that \hat{p} is the (unweighted) average of the p-values for each school. When m_hi = m, i.e., the number of
students selected from each school is the same, (4) reduces to:

(5)    \[ \hat{p} = \frac{1}{nm} \sum_{h} \sum_{i=1}^{n_h} X_{hi} \]

where X_hi = number of students answering the item correctly in the ith school in stratum h. The estimator \hat{p} in (5) is
the simple proportion, the number answering correctly divided by the total number of students taking the test.
We selected one classroom per campus. Although classroom size will vary somewhat across schools, it was
the judgment of WLC/MRC statisticians that this would not markedly affect the estimates obtained using (5).
The usual estimates of point biserials and KR-20 reliability coefficients were employed. These are considered
as measures describing various characteristics of the tests, rather than estimates of population parameters.
Thus weighting factors were not considered for these measures.
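
The self-weighting property can be checked numerically with a short sketch. Everything here is illustrative: the function names and the stratum and school sizes are made up, and the estimator is written in the usual p.p.s. (with replacement) form, which is assumed to be the "classical unbiased estimator" the text has in mind. Under proportional allocation and p.p.s. selection the weights cancel, so the estimate equals the plain average of the school p-values.

```python
def proportional_allocation(n, stratum_sizes):
    """Number of schools per stratum: n_h = n * M_h / M."""
    total = sum(stratum_sizes.values())
    return {h: n * size / total for h, size in stratum_sizes.items()}

def pps_p_value(sample, stratum_sizes, n):
    """Weighted estimate of the statewide p-value.

    sample: list of (stratum, school_size, school_p_value) tuples,
            one per selected school (hypothetical layout).
    """
    total = sum(stratum_sizes.values())
    n_h = proportional_allocation(n, stratum_sizes)
    estimate = 0.0
    for h, school_size, p_school in sample:
        z = school_size / stratum_sizes[h]       # selection probability
        estimate += (school_size * p_school / z) / n_h[h]
    return estimate / total

# Made-up strata (student counts) and a sample of four schools.
strata = {1: 1000, 2: 3000}
schools = [(1, 200, 0.5), (2, 300, 0.6), (2, 500, 0.7), (2, 100, 0.8)]

weighted = pps_p_value(schools, strata, n=4)
unweighted = sum(p for _, _, p in schools) / len(schools)
print(weighted, unweighted)
```

The two printed values agree up to floating-point rounding, which is the sense in which the design is "self-weighting".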
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Texas Education Agency 




♦ STATE BOARD OF EDUCATION 
• STATE COMMISSIONER OF EDUCATION 
• STATE DEPARTMENT OF EDUCATfON 



201 East Eleventh Street 
Austin, Texas. 

78701 . 



Letter sent»'to all Executive Directors of the twenty education service centers 



The State Board of Education has identified Career Education as one of the top priorities for development. As 
a part of this priority, a set of important student outcomes m Career Education has been identified. Based on 
these student outcomes, we are now buiidmg a measurement system for Career Education that is describee) in 
the attached summary. We believe this system will provide information useful to you and your staff in the coun- 
seling and instruction of your students. . - ^ ^ ^ . 

The measurement system is in the developmental stage. Test items have been written and grouped into triaP 
instruments at two levels of student, development. In order to insure that these instrument^ are cff the highest 
possible quality, it is essential that they be pilot tested with a sample of Texas students'in grades sevfen 
"through eleven starting in mid-fvtarch. 

We have drawn a random sample of school campuses that represent different types of Texas students. One or
more campuses in your school district are included in the sample. Would you be willing to cooperate with us in
this effort by allowing some of your students to take one of these instruments? It would require less than one
hour of class testing time (an ordinary class period) for each participating student and an additional one and
one-half hours' time for each teacher to prepare for the administration of the test.

Attached is a list of campus(es) and number of classrooms requested to participate in your school district. I
would appreciate it if you would return the enclosed form to let us know whether or not you can assist us. If you
have additional questions or would like further information, please contact Keith Cruse or Bill Fischer of the
Division of Program Planning and Needs Assessment (512/475-2066).



I hope that you will feel that your school district can work with us on this important effort to strengthen the
opportunity for all students in Texas to achieve the essential outcomes in Career Education.

Very truly yours,

M. L. Brockette
Commissioner of Education

"An Equal Opportunity Employer"
Texas Education Agency

• STATE BOARD OF EDUCATION
• STATE COMMISSIONER OF EDUCATION
• STATE DEPARTMENT OF EDUCATION

201 East Eleventh Street
Austin, Texas 78701




Letter sent to all Executive Directors of the twenty education service centers



As we described to you earlier in Texas Elementary and Secondary School Planning Council meetings, one of
the Agency activities for the priority area of Career Education is the development of a measurement system for
the "Basic Learner Outcomes for Career Education." Plans for the March administration of this measurement
system have been revised to increase the usefulness of these tests. Rather than a statewide administration of
the instruments, we are preparing to pilot test 22 developmental instruments which measure a set of outcomes
from each of the nine categories of the basic learner outcomes.

A small random sample of 84 school districts has been drawn for pilot testing these instruments. Attached is a
letter that we mailed to the superintendents of the schools in the sample. Additional information provided to
these superintendents is also enclosed, along with a list of sample schools in your region.

If you or your staff members have specific interest in this activity, we welcome your inquiries and participation
as we proceed with the next phase of this project. Keith Cruse, Division of Program Planning and Needs
Assessment (512/475-2066), will be available to respond to your questions and provide additional information.
Further details will be provided to guidance and career education coordinators in future statewide meetings.

Yours truly,



Charles W. Nix
Associate Commissioner
for Planning and Evaluation

CWN:jr 

Attachments 



"An Equal Opportunity Employer" 
A MEASUREMENT SYSTEM FOR CAREER EDUCATION

SELECTING CLASS GROUPS FOR INSTRUMENT ADMINISTRATION

An important purpose for the pilot testing of the Career Education instruments is to get an accurate
assessment of how all types of students at specific grade levels react to the instruments in general and, more
specifically, to the kinds of questions that are asked. The information provided by students from your school
and from other schools will be grouped together and used to project how the instruments will be used when
administered to students all over the state. As you can see, if the instruments are tried only with one type of
student, such as the top students in each school, the information will give a distorted impression of how
students perform. You are being requested to use the following guidelines when you select class(es) for
participation in the pilot testing. These guidelines are for the purpose of helping you select the kinds of classes to
provide the types of students that are needed. In no way is the overall performance of your school being
evaluated.

Guidelines for Selection of Classes: 

The following points should be considered when selecting a class(es) for participation in the pilot testing. The
class(es) should:

• be representative of the ethnic make-up of the school.

• contain students with a mixture of abilities, not "honors" classes that would lack an overall
representation of student abilities.

• have a high majority of students at the grade level requested. It is realized that in high school it might 
be difficult to select a class that contains just one grade level of students. 

• have from 20 to 35 students. 
INSTRUCTIONS FOR COMPLETING THE

ASSESSMENT IN CAREER EDUCATION TEST EVALUATION FORM

Enclosed is the Assessment in Career Education (ACE) test evaluation form. This form asks questions about
your perceptions and those of your students concerning the organization and sequencing of directions,
instructions, and items contained in the Career Assessment Instrument. For this reason, it will be necessary for
you to become thoroughly familiar with the questions on the form before you have administered the test
instrument.

If any additional comments or space is needed to further elaborate on any of the questions on the evaluation
form, please feel free to use the remainder of this instruction sheet. In addition, you [illegible]
mailer so that you may return the evaluation form upon completion.

COMMENTS:



Form No.



Regional ESC No. 
Campus Name 



ACE TEST EVALUATION FORM 

I. Presentation of Orientation Session



Yes No 






1. I understand my role in and the purpose of the Career Education Assessment
Instruments.

2. The orientation session was useful in providing answers to all questions that
arose concerning the test and its administration.

3. I clearly understood the instructions which outlined the tasks I was to perform as
the test administrator.

II. Instrument Design

4. The items on the test were in a logical sequence and well organized.

5. After the students received the instructions for the test instrument, did they
understand what they were supposed to do? If they did not, what seemed to be the
problem? 



6. Were there directions within the test questions that at least three students did
not seem to understand? If there were, please record the number of the item(s) 
and give a short comment about the problem with the item direction. 



Item No.



Item No. 



7. Were there words used in the test questions that at least three students did not
know? If there were, please record the number of the item(s) and the word(s). 



Item No. 



Item No. 



8. Were there any items that offended any students? If there were, please record 
the number of the item(s) and comment.



Item No. 
Item No. 



9. Did the students have any problems answering the Student Information Form
questions found on the back of the test instrument? If they did, please identify
which question and identify the problem. 



Ques. # 
Ques. #
Management

10. Indicate the size of group in which the test instrument was administered.



11. Were there any problems with the format of the answer sheet that caused
students trouble? If there were, please identify the trouble. 






Student Information



12. Did you have any problems in scoring the open-ended item(s)? If you did, please
record the item number and comment about the problem. (if applicable)

Item No.



Item No. 



13. Do you think the information received from this kind of item has enough value in
relationship to the time it takes to score the item? (if applicable) 



Comments:



14. Approximately what number of students finished the test in:

20 min.    40 min.    55 min.    Did Not Finish in One Testing Session



15. What subject do you teach? (main assignment)




